Introduction
Taking the plunge and learning to program is probably the single most useful thing I’ve done for myself. Originally I had been a little hesitant to start, but now I can see that coding has opened up so many new horizons. Almost every task in engineering requires some element of coding, and I can’t imagine where I’d be today without it. That said, I’m mostly self-taught. Whenever a task calls for coding, I buckle down and boot-camp until I can do what’s needed from the ground up. Case in point this portfolio - I had zero web-dev experience coming in but spent the time to learn enough css and nodejs to cobble together what you’re seeing now.
I’m proficient in Python and specialize in MATLAB, but I’ve always been capable and more than willing to pick up new languages on the spot as needed. My next goal is to learn Julia, which is similar to Python but as fast as C. A lot of my coding experience so far has been math heavy, so I’ve just chosen to include the more here are some of my more visual outputs. All code and algorithms were self implemented.
Hough Transform and Shape Detection
Example Input | Example Output |
---|---|
The Hough Transform is one of the major methodologies for geometry detection in any sort of real time computer vision or image analysis. Above is the algorithm being used to detect lines in a control image. This is done by first preprocessing the image with sharpening and Canny Edge Detection. A voting procedure is executed to find the strongest lines and graphed in a line parameter space, such that every point in the cartesian plane is a line in the Hough space, and vice versa. The points in the Hough space with the most votes are the are then the lines in the original image. The associated Hough space is pictured below with axes theta and rho (Note that the top peaks are equivalent to the bottom peaks as theta ranges from [-180,180]deg):
Applying this to a real image, There had to be a lot of non-maximal suppression and other filtering. Circle detection works similarly, except the new Hough space parameters include radius and center coordinates. In the following image, the goal was to detect solely the pen and coin geometries while suppressing other false boundaries:
Real Input | Real Output |
---|---|
Unfortunately I did miss out on that one pesky coin. I think that the shadows made it look like an ellipse, which my algorithm was not configured to (although I did do ellipse detection later!). The Transform can be generalized to any geometry and produce pretty awesome images. I highly recommend you google Hough space images!
Disparity and Depth Mapping
Disparity is another common technique in computer vision. By taking two images from very slightly different angles, you can create a depth map by utilizing epipolar geometries. It’s super useful in controlled environments, but the major disadvantage is time complexity O(N*W) for W window pixels (Hough Transform has a space complexity issue - O($N^W$) for W parameters). Here is a quick and dirty algorithm I made that took about 15 minutes to compile as disparity brute force calculates the position shift kernel of every pixel. Darker means further back:
Input | Output |
---|---|
Movement Detection via Lucas-Kanade
Similar to disparity, motion can be detected by calculating the image gradients between frames. To help with resolution, the Luca-Kanade was computed iteratively across increasing image resolutions - this helped differentiate between and represent both larger and smaller motions. It takes a bit of linear algebra but the results can be super useful. Here is the result of running my implementation on 3 frames of a video of a girl playing with her dog:
Frame 1 | Frame 2 | Frame 3 |
---|---|---|
The results show frame to frame motion, and total motion from frame 1 to 3 is shown in the third column. Specifically we can tell that the girl is primarily moving right and down (likely a result of bobbing while she steps) and the dog is move left and standing up straighter as it pivots with the girl. Pretty useful information in more technical areas where even the slightest movements matter!
Harris Detection and Homography Computation
The final bit of computer vision I’ve done is detecting corresponding points between two pictures through Harris Corner Detection, and then calculating the overall homography transformation between the two. Essentially, If all I have are two 2D pictures of the same scene taken from different positions and orientations, I can derive 3D world information about what those different positions and orientations are. This helps a computer tell where corresponding points are in each picture, which is integral to image stitching (panoramas) and projective calculations.
2-dimensional gradients were are calculated in each picture to form a pre-curated list of interesting points. The list along with magnitude and orientation information was fed into scale invariant feature transform (SIFT) to match the points together, and random sample consensus (RANSAC) (another voting procedure) was used to determine the transformation homographies between the images allowing for point matching.
Facial Analysis with PCA and KNN
Principal Component Analysis and K-Nearest Neighbors are staples in any statistician’s tool kit. One really interesting use-case of the two used together is random face generation. Using 40 subjects (10 samples each) from Oxford Olivetti Face data set, I used PCA to extract ~325 features (‘eigen faces’), accounting for 99.5% variability, from the covariance matrix and mean face of the training set.
Mean Face | Variability by Features |
---|---|
K-Nearest Neighbors was then implemented to evaluate the feature quality and achieved 95%. Accuracy dropped rapidly with neighbors, indicating that many of the faces shared common characteristics.
The technique is the basis of amazing (and honestly kind of scary) applications like these: Which Face is Real? and This Person Does Not Exist. The first is a comparison game, and the second generates an artificial person everytime you refresh.
Paddington GroupMe-Speech
The rest of the projects have been pretty math and algorithm heavy, but this one is more traditional “make something do something.” It started off with a friend and I goofing off in class - we made it a point to only communicate with text to speech (courtesy of Natural Reader TTS) during lab time. Discussion turned to making a web-connected text-to-speech (TTS) stuffed animal - and that’s exactly what we ended up with, meet Paddington.
Paddington is a stuffed bear with head full of code and a heart made of RaspberryPi (ZeroW). The code utilizes 2 major resources: GroupMe API in order to access messages information, and Microsoft Azure Cognitive Services (MACS) to convert text to speech. With the former, it’s easy to select a particular channel and monitor messages coming in. Do so in a loop and add a couple of filters for valid messages forms the basis of the program.
When reading a message in, user information is noted. Administrators are hard-coded in and have access to a full list of commands, such as changing the target channel, banning and unbanning, and toggling TTS status (all through the basic GroupMe UI). Other users may set a temporary account with a simple command. Accounts enable you to set choose a specific voice that is used whenever a you specifically send a message. Those without accounts all share a general guest account.
Once a message read and passes preliminary spam filtering, the user voice and message contents are wrapped. MACS is pinged under a free developer plan. MACS offers several hundred voices, but options were pared down to 24 distinct, global accents. The text is converter to a .wav file and then played locally via onboard speakers. I made a fun little demo on the just-because section, make sure to check it out!
Machine Learning
I’m a little late to the band wagon, but I’ve recently been learning different machine learning - based classification techniques. Among the endless types of discriminant analyses, some of the cooler things I’ve done is implement: a Naive Baye’s Classifier with an adaptive reward mechanism for document classification, high-dimension hyperplane Support Vector Machine with kernel method for high accuracy facial recognition, and a shallow Neural Net for handwritten number classification. These don’t have the most visual outputs (unless you like confusion matrices and accuracy heatmaps), so I won’t bog the section down with pictures.
Various Modeling
Logistics Solution Plot | Square Wave Approximation |
---|---|
Here are some various models - Lorentz Attractor, Logistics Equation, and Gibbs Phenomenon (square wave approximation). They weren’t technically difficult (except for making an animated plot in MATLAB), but they were some of my first visualizations practice. Along with a some sorting visualizers (I meant to do more but stopped after bubble, insert, and quick sort), I thought these were small enough to share. Credit to Steve Brunton of the University of Washington (Lab Group, YouTube, Google Scholar) for providing math explanations! download here
Relationship Reminder
This is a pretty short quality of life driven program. I thought it was kind of sad that most relationships seem to end because both parties just forget about each other, and so I just wrote a little reminder system using smptlib. I noted all of the relationships I wanted to keep going and how often would be appropriate based on how close we were. Pandas was used for data processing and formatting, and then I use a proxy account to email myself. The script runs off an old Macbook that I use as a server to send automated reminders every Sunday.
Desktop Popup | Phone Notification |
---|---|
Pong
Obligatory beginner’s pong