November 11th, 2014
A while back I had a small digital ocean instance running elastic search as part of an app. There was a big vulnerability in the dynamic scripting module that allowed arbitrary code execution. My instance got hacked around two months after the vulnerability was disclosed, because I had never heard about the disclosure.
I had been wanting to try out the Phoenix framework in Elixir at the time, so I put together a little app that monitors security disclosure lists and notifies you when a package in your application has a vulnerability. If your packages are managed through a package manager like apt, npm, pypi, or rubygems, then it's really simple to get notified when there's a disclosure affecting you.
The site is called vuln.pub. You can create a "monitor" which has a manifest describing the dependencies in your app. vuln.pub will periodically poll your manifest for changes in your packages and package versions. When a vulnerability comes in from a seclist, it will find all the packages you have installed, and if their version is vulnerable, send you a notification.
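The matching step at the core is simple in principle. Here's a rough sketch (the function names and manifest/advisory shapes are made up for illustration, not vuln.pub's actual code):

```python
# Hypothetical sketch of the core check: given a parsed manifest and an
# advisory, find installed packages whose version is in the vulnerable
# range. Assumes simple dotted versions and a single "fixed_in" bound.

def parse_version(v):
    """Convert '1.4.2' into a comparable tuple (1, 4, 2)."""
    return tuple(int(p) for p in v.split("."))

def vulnerable_packages(manifest, advisory):
    """Return packages from the manifest affected by the advisory.

    manifest: {package_name: installed_version}
    advisory: {"package": name, "fixed_in": version} -- anything below
              fixed_in is considered vulnerable.
    """
    hits = []
    name = advisory["package"]
    if name in manifest:
        if parse_version(manifest[name]) < parse_version(advisory["fixed_in"]):
            hits.append(name)
    return hits

manifest = {"elasticsearch": "1.1.1", "requests": "2.3.0"}
advisory = {"package": "elasticsearch", "fixed_in": "1.2.0"}
print(vulnerable_packages(manifest, advisory))  # -> ['elasticsearch']
```

When a hit comes back non-empty, that's when the notification goes out.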
Elixir was surprisingly pleasant, much more so than node. I think I'll be writing a lot of stuff in the future with it.
Stuff used: elixir, phoenix, postgres, ecto, browserify, influxdb, gulp
June 14th, 2014
I don't even know how I came across aquaponics, but it seemed like something that would be fun to overcomplicate. Also fish are cool and basil is delicious.
I built a little ebb and flow system controlled by a raspberry pi. The pi runs the pump and the valve on a timer to fill and drain the grow beds. It also monitors pH via a pH sensor on i2c, and a water temperature sensor collects data as well. All the readings are displayed and graphed on a web page.
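The timer logic is about as simple as it sounds. Something like this (the flood/drain durations are placeholders, not my actual settings):

```python
# Sketch of the flood/drain cycle decision (hypothetical constants):
# flood the grow bed for 15 minutes out of every hour, drain the rest.

FLOOD_MINUTES = 15
CYCLE_MINUTES = 60

def pump_should_run(minute_of_day):
    """True while we're in the flood portion of the current cycle."""
    return (minute_of_day % CYCLE_MINUTES) < FLOOD_MINUTES
```

The real loop just polls this on a schedule and toggles the pump and valve GPIO pins accordingly.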
Everything that runs the sensors and the web console is here.
February 22nd, 2014
I heard about the US Census API and was poking around inside it. There are thousands of cool datasets behind a well-executed API, but it's pretty hard to navigate because of the sheer scale. I thought it would be cool if there were a way to organize them and maybe provide a simple visualization for each dataset. I created a thing called Census Explorer that attempts to do that. Hopefully it will make the datasets more accessible. Right now it covers the SF1 and ACS5 datasets from the 2010 Census.
I pretty much just dumped all the dataset descriptions into an Elasticsearch index and provided a simple API for getting at it. The web interface is intentionally minimal, as this is really just a weekend experiment. The source lives here.
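The "dump" really is just a flattening step. A sketch of what that looks like (field names are illustrative; the real descriptions come from the Census API's dataset metadata, and the resulting documents would be handed to the elasticsearch bulk API):

```python
# Hypothetical sketch of preparing Census dataset descriptions for
# indexing: flatten {variable_id: description} into one document per
# variable, tagged with the dataset it came from.

def build_docs(dataset_name, variables):
    """Flatten a dataset's variable descriptions into indexable documents."""
    return [
        {"dataset": dataset_name, "variable": var_id, "description": desc}
        for var_id, desc in sorted(variables.items())
    ]

docs = build_docs("sf1", {"P0010001": "Total population"})
```

With everything in one index, "search across thousands of datasets" falls out of a single full-text query on the description field.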
Stuff used: flask, elasticsearch
November 10th, 2013
People have been sending a lot of good links over meatspace. There are a lot of good tunes, sites, etc being shared, so I made a simple thing that listens to the meatspace chat and saves the links. The source is here.
Update: I recently added a radio feature. The bot pulls any soundcloud links or youtube music out of the conversation and indexes them. If you click the link in the top right of meatlinks, an embedded player shows up, picks a random song to play, and then moves on to the next one. So it's a little shuffle player for music that has previously been shared. Sometimes the category of a youtube song isn't set to "music", so to explicitly add a youtube link you can include "musicbot" in the meatspace message.
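The harvesting logic boils down to a regex plus the "musicbot" override. A sketch (function names and the exact rules are illustrative, not the bot's actual code):

```python
import re

# Hypothetical sketch of the link-harvesting step: pull YouTube and
# SoundCloud URLs out of a chat message, and honor the "musicbot"
# override for YouTube videos that aren't categorized as music.

MUSIC_URL = re.compile(
    r"https?://(?:www\.)?(?:youtube\.com/watch\?\S+|youtu\.be/\S+|soundcloud\.com/\S+)"
)

def extract_music_links(message):
    """Return all candidate music URLs found in a chat message."""
    return [m.group(0) for m in MUSIC_URL.finditer(message)]

def should_index(message, url, youtube_category=None):
    """SoundCloud links always index; YouTube only if its category is
    Music, or if the sender tagged the message with 'musicbot'."""
    if "soundcloud.com" in url:
        return True
    return youtube_category == "Music" or "musicbot" in message.lower()
```

The bot would run this over each incoming message, then stash anything that passes into the radio index.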
Stuff used: flask
August 4th, 2013
After Yahoo purchased tumblr, it didn't take them long to start messing it up. Yahoo allegedly adopted some fairly strict content filtering: based on the content of your blog, it could be blocked from the site's internal search as well as external (ie: google) searches. I built an application that crawls tumblr, builds a list of the unindexed blogs, and makes it searchable. To be honest, I don't really understand tumblr, and I can't tell the difference between spam and not-spam, since everyone is just reposting everyone else's (mostly nsfw) posts anyway; Yahoo alleges this is mostly a spam filtering strategy. Nevertheless, it was a policy change that got people worked up, and it was sort of a fun Sunday evening project.
(link removed, i took the server down)
Stuff used: scrapy, django, backbone.js, elasticsearch
May 31st, 2013
The folks at the Chicago Tribune built a load testing utility called Bees with Machine Guns which is a nice little tool that starts a number of EC2 instances, then hits a URL a bunch of times, and then shuts everything down after generating a report. It's a cheap, easy, and realistic way to load test your site. I've been using Digital Ocean for a few months now, mostly because it's cheaper, but they also provide a nice API, so I modified bees with machine guns to run on digital ocean rather than AWS. The project is called Minnows with machine guns.
May 28th, 2013
I wanted to put a bunch of LEDs in the ceiling of my '73 VW bus, and I finally put the arduino to use in this project. The arduino controls four TLC5940 LED drivers, and an Android app I wrote talks to the arduino over a serial bluetooth adapter. There are 64 LEDs, each individually addressable and fadeable. It's pretty basic, but it was the most elegant way I could think of to light up the inside of the bus. The end result is something I'm pretty happy with...it works well and looks pretty cool.
The result is here
I also wrote some stuff to measure the cylinder head temperature on the engine via the stock fuel injection temperature sensor. The sensor is just a resistor that changes its resistance as the engine heats up. I found a resistance-vs-temperature graph on Ratwell's site and fit it to get a function that converts resistance to temperature. The temperature displays in the phone app. In the future I'll also be measuring the main and aux battery voltages, as well as the RPM via a hall sensor.
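For the curious, a sensor like this behaves as an NTC thermistor, so the fit reduces to the standard beta-model conversion. The coefficients below are placeholders for illustration, not the values I actually fit from the graph:

```python
import math

# Standard NTC thermistor beta-model conversion. B, R0, and T0 are
# illustrative placeholders -- the real ones come from fitting the
# published resistance-vs-temperature curve.

B = 3950.0      # beta coefficient (assumed)
R0 = 2500.0     # resistance in ohms at the reference temperature (assumed)
T0 = 293.15     # reference temperature in kelvin (20 C)

def resistance_to_celsius(r_ohms):
    """Convert a measured sensor resistance to degrees celsius."""
    inv_t = (1.0 / T0) + (1.0 / B) * math.log(r_ohms / R0)
    return (1.0 / inv_t) - 273.15
```

Since the resistance drops as the engine heats up, lower readings map to higher temperatures; the app just runs each ADC sample through this.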
In the future I might ditch the arduino for something faster like a raspberry pi, as there are some pretty annoying performance issues with generating tons of serial interrupts. However, the arduino is neat because it's very low power.
The stuff used:
April 16th, 2013
It's time to move again. That means I've been browsing the Craigslist apartment and housing section more than I'd really like. Looking for housing in Vancouver already blows, so I didn't think it could get much worse. Since my girlfriend has a dog that she (and I) would love to have live with us, that meant checking the "allows dogs and cats" box on the Craigslist search form. As soon as you do, you notice there are like 2 listings that meet the criteria. That sucks...but surely it can't just be a Vancouver thing. After all, this is the place with an organic, free range, grass fed, fair trade dog food store on every block, so there's no way it can be unfriendly towards pets, right? People in Vancouver must really love dogs and cats.
Question: which cities are the friendliest towards pets?
I looked at the Seattle Craigslist and saw there were plenty of ads that allowed pets, so I decided to take it a step further. I wrote a little script that looks at the Craigslist apartment and housing listings for major cities. It puts the listings into buckets by date, compares the number of postings that allow pets to the total number of postings for each date, and then averages across dates to get the percentage of postings that allow pets, by day. Initially I wasn't expecting any significant differences between cities, but the results showed otherwise (damnit). Each city had about 2500 listings taken into account, based on current (April 16, 2013) craigslist ads.
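The aggregation itself is tiny. A sketch of it (the input shape is made up; the real script was scraping listing pages):

```python
from collections import defaultdict

# Sketch of the aggregation described above: bucket listings by posting
# date, take the pet-friendly fraction per date, then average the daily
# fractions into one percentage for the city.

def pet_friendly_percentage(listings):
    """listings: iterable of (date, allows_pets) tuples."""
    buckets = defaultdict(lambda: [0, 0])  # date -> [pet_friendly, total]
    for date, allows_pets in listings:
        buckets[date][1] += 1
        if allows_pets:
            buckets[date][0] += 1
    daily = [pets / total for pets, total in buckets.values()]
    return 100.0 * sum(daily) / len(daily)
```

Averaging per-day fractions (rather than pooling all listings) keeps one unusually busy posting day from dominating a city's number.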
TL;DR : If you want to have a pet in Vancouver, then move to Seattle
March 12th, 2013
As part of a project I'm working on in my free time, I needed to figure out corporate relationships. The SEC requires all publicly held corporations to file a list of their subsidiaries in their Form 10-K each year. So by scraping a section of the 10-K document (called Exhibit 21.1), you can extract a list of subsidiaries for that registrant. The issue is that every company files its 10-K in a different format, and the lack of uniformity makes scraping a lot harder. Moreover, it says nothing about privately held companies. Anyway, I did my best, and it manages to extract a lot of information.
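To give a flavor of the problem, here's a toy version of the scrape (the heuristic is illustrative and far simpler than what real filings require; that variability is the whole difficulty):

```python
import re

# Toy Exhibit 21.1 extractor: match lines that look like a subsidiary
# name followed by at least two spaces and a jurisdiction. Real filings
# use tables, commas, parentheticals, and every format in between.

LINE = re.compile(r"^(?P<name>[A-Z][\w .,&'-]+?)\s{2,}(?P<state>[A-Z][a-zA-Z ]+)$")

def extract_subsidiaries(exhibit_text):
    """Return (subsidiary_name, jurisdiction) pairs found in the text."""
    subs = []
    for line in exhibit_text.splitlines():
        m = LINE.match(line.strip())
        if m:
            subs.append((m.group("name"), m.group("state")))
    return subs
```

A pattern like this catches the well-behaved filings; everything else needs its own special case, which is why the real extractor is mostly a pile of heuristics.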
I made the project lightweight and separate from any storage backend, so I should be able to easily integrate it back into the larger project I'm doing at a later date. Also, I was hoping that others might find it useful. It's a little bit out there though, so who knows.
It's up on Github here
Built with python
February 20th, 2013
The aircooled VW community is pretty chill. There used to be a site that listed contact information of people willing to help out a travelling aircooled VW owner should there be a mishap on the road. The old site wasn't being maintained anymore, so I made my own. I scraped all the old info from the previous site, and made the new site accept registrations, rather than running each listing by the webmaster. Then all that info gets plotted on a map.
You can check it out here
Built with django, backbone.js, and bootstrap
February 2nd, 2013
Figured it would be possible to map the posts on reddit's earthporn subreddit by geocoding the post titles. Then you can see where the posts are geographically, which is nice for discovering pretty stuff around you.
(link removed :()
Built with python
January 13th, 2013
Thought it would be interesting to visualize the connections of Hubski in a different way. Though the graph should be directed, I wanted to keep it simple, so right now it’s undirected. Size represents the number of followers someone has. Let it settle for a couple seconds, then click the “Stop Layout” button. You can zoom with your mouse wheel. Mouse over a user to eliminate all users that aren’t directly following or followed by them.
[link removed, sry :(]
Built with python and sigma.js
September 12th, 2012
While reading a paper for class, I felt compelled to try my hand at implementing the approach they took. A lot of times I read things and they make some sense, but I don't really know how much I don't know about them until I stop reading and try doing. The paper is called Finding and evaluating community structure in networks, from 2003.
I read the paper a couple months ago, and the other day started thinking about all the uses for a way to pick out communities within a larger network. Basically, the paper says to calculate the betweenness of each edge in a graph, then remove the edges with the highest betweenness. Since betweenness measures how often an edge is crossed by shortest paths between pairs of nodes, we're removing the edges most commonly crossed on a shortest path from node a to node b. Eventually the original graph splits into smaller graphs whose nodes, from this perspective, are more similar to each other.
So, thinking about this in terms of a real community: two subreddits a and b are connected if some user has two comments c1 and c2 that live in a and b respectively. That constitutes an edge in the graph, where a node is a subreddit. I used the reddit api to pull down some submissions and comments, from which I constructed a graph. I figured it would be neat to evaluate this on large sets of data, so I committed the graph to a redis instance. Once the data is downloaded, a python script loads the entire data set from redis and begins classifying communities (the fun part!). The following occurs:
- Calculate every shortest path for every pair of nodes in the graph
- For each node in the graph, find the fraction of those paths that pass through that node. This is betweenness (as per wikipedia's definition)
- Remove the edge with the highest betweenness (here, just the betweenness of the start and end nodes added together…maybe this assumption is flawed)
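For what it's worth, the textbook version of the algorithm scores edges directly rather than summing node betweenness. A compact sketch of that edge-removal loop (not the original script; it finds one shortest path per pair rather than all of them):

```python
from collections import deque
from itertools import combinations

def shortest_path(graph, src, dst):
    """BFS on an adjacency dict; returns one shortest path as a node list."""
    prev, frontier, seen = {}, deque([src]), {src}
    while frontier:
        node = frontier.popleft()
        if node == dst:
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return path[::-1]
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                prev[nbr] = node
                frontier.append(nbr)
    return None  # disconnected

def edge_betweenness(graph):
    """Count how many pairwise shortest paths cross each edge."""
    scores = {}
    for a, b in combinations(graph, 2):
        path = shortest_path(graph, a, b)
        if not path:
            continue
        for u, v in zip(path, path[1:]):
            edge = tuple(sorted((u, v)))
            scores[edge] = scores.get(edge, 0) + 1
    return scores

def remove_top_edge(graph):
    """Cut the most-crossed edge; repeating this splits out communities."""
    scores = edge_betweenness(graph)
    u, v = max(scores, key=scores.get)
    graph[u].remove(v)
    graph[v].remove(u)
    return (u, v)
```

Run `remove_top_edge` in a loop and the bridges between dense clusters go first, which is exactly the community-splitting behavior described above.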
(source is in the depths of my computer, somewhere..)
Built with python and arbor.js