flowerhack: (Default)
2015-07-23 02:36 am

Drinking from the Information Firehose: A New Approach

Something I've struggled with more and more these days is keeping up with all the stuff there is to read on the internet. Every day, there's an update from a friend's blog, or a cool article on a tech news site, or a post from my favorite blogger… I wind up letting myself get interrupted by all these irresistible information-trinkets, trying to skim each article quickly so I can move on to the next one, and searching for more once I've finished all my skimming, hungry for yet more little factoids.

One solution, obviously, would be to just cut down on the number of blogs and feeds that I subscribe to. I'm working on doing that, but it's a hard problem to tackle all at once: how do I decide which feeds are valuable, and which are useless? How do I handle a site that I really enjoy but that has a low signal-to-noise ratio? And so on.

If I could just filter these articles better, though, that'd be another solution altogether.

And it's a solution that works really well! I've been trying out a new article filtering system these past couple weeks, and while I hate to crow "success!" prematurely, I do feel like I'm spending less time "keeping up," and the articles I'm reading are more useful, and I don't feel weirdly anxious about missing out on news.

The idea's very similar to inbox zero, if you're familiar with that.

Basically: I browse the internet as I usually would, going through any blogs or RSS feeds and such that I like. When I see an article I'd like to read, though, instead of reading it right away…

1) I bookmark the article. (I use Pinboard for bookmarks but I'm sure other bookmark managers work too.) Most of the time, I try to tag the article with a relevant category right away—for instance, some tags I've used today are "https," "game design," and "security." If I'm pretty sure I'll want to read the article later on that day, it can remain untagged.

2) Once a day, I look at all my untagged bookmarks. All of these bookmarks must be read on the spot, or else tagged with a relevant category for later reading. No leaving untagged bookmarks lying around!
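The two steps above can be sketched in a few lines of Python. This is only an illustration of the workflow, not anything Pinboard provides: the Bookmark class, the daily_triage function, and the choose_tag callback are all made-up names, and the sketch assumes that an article read on the spot simply leaves the list afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class Bookmark:
    url: str
    tags: set = field(default_factory=set)


def inbox(bookmarks):
    """The 'article inbox': every bookmark that has no tag yet."""
    return [b for b in bookmarks if not b.tags]


def daily_triage(bookmarks, choose_tag):
    """Once-a-day pass over the untagged bookmarks.

    choose_tag(bookmark) is a hypothetical callback: it returns a tag
    string to file the article for later, or None to read it right now.
    Returns (to_read, kept); nothing in `kept` is left untagged.
    """
    to_read, kept = [], []
    for bookmark in bookmarks:
        if bookmark.tags:
            kept.append(bookmark)        # already filed under a topic
            continue
        tag = choose_tag(bookmark)
        if tag is None:
            to_read.append(bookmark)     # read on the spot (assumed to leave the list)
        else:
            bookmark.tags.add(tag)       # filed for a later topic-reading session
            kept.append(bookmark)
    return to_read, kept
```

After a pass, inbox(kept) is always empty, which is the "inbox zero" part of the idea.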

Using this method, I'm down to about 3-7 articles left untagged at the end of each day, which makes for about a half-hour of "keeping up" with the basic news and shorter pieces of the day. That's just about how long I spend commuting on the bus, so I can get my reading done while in transit. Awesome! It's like a little newspaper.

And the articles that were tagged for later wind up being more like little magazines: read less frequently, and specialized into different topic "groups" based on the tags.

Thus, when I'm in the mood to soak up some long-form journalism or some more technical reading (yay, lazy Sunday afternoons), I can pick one of those tags and spend an hour or two reading about a single topic. This lets me comprehend particularly technical articles better—rather than switching between wildly different areas of computing while going through all the "tech news" links of the day, I'll instead sit down and read several articles about, say, the logjam vulnerability at the same time, and do more of the "try it in your own terminal!"-type experimentation that I don't tend to do when I'm just trying to read over the day's news.

And at the end of the day, my "article inbox" (untagged articles) count remains fixed at zero, which leaves me feeling very relaxed and happy indeed :)
2015-06-02 06:48 pm

How should programmers share ideas?

A couple weeks ago, I was reading a pretty excellent paper on internet password research. As I read, I found myself becoming vexed. Here was an eminently practical paper, giving very practical suggestions that any web developer could act upon right away... and yet I never would have read the thing at all if I hadn't had a friend in academia who happened to give me a link to it.

What other useful stuff has academia been doing without my noticing, and how can I find out about it?

Like (I suspect) most software engineers, I get my industry news/trends/updates/etc from a smattering of populist-ish sources such as: blogs of programmers I admire, blogs of friends, Hacker News (with reluctance), sometimes Slashdot, that sort of thing. If the password security paper had wanted to spread its ideas via one of those channels, it seems like it'd be pretty doable. Simply extracting the "take-away points" section of that paper into a blog post would make for solid reading, and a link to the full paper could be included for the curious.

I know that I would love it if more academics shared casually-worded summaries of their papers. Even when I'm planning to read the full paper, I'll try to ask someone else "so what's this actually about" before I do. Usually, the on-the-spot, casually-worded summary is more transparent than the abstract, and helps me direct my reading better.

There's also an argument to be made that perhaps software engineers as a whole should really have a source for industry news that doesn't involve upvotes or Randos On The Internet. I would be tremendously curious to hear from engineers in other fields about this—has some other group found a tidy compromise between social news sites and impenetrable academic journals? I suspect this sort of compromise is what the ACM Queue is trying to accomplish, but I only know about the Queue because of, well, another friend in academia. Maybe the Queue just needs a better PR department?

In any case, while not every field seems to be affected by social-industry-news-sharing the same way software is (medicine, for instance, requires that doctors complete X hours of professional development, done via formal exams, classes, and so on), this isn't a problem solely afflicting software engineering, either. I was prompted to write this post after stumbling onto an article about the Volokh Conspiracy, a blog by a handful of legal scholars that apparently has become just as influential as major law journals in shaping US legal thought (arguably more so, since they can offer faster feedback than the journals can). And as far as I can tell, arXiv has become the "open beta test" for papers in fields like math and astrophysics as well as CS, and I'll sometimes see arXiv posts linked on Facebook or whatnot.

I'd love comments from folks who understand how other fields handle this dilemma, or who have cool ideas I haven't thought of yet!
2014-12-02 11:48 am

A Python Internals Adventure

I spent a little while digging around in CPython recently, and thought I’d share my adventure here. It’s a bit of a riff on Allison Kaptur’s excellent guide to getting started with Python internals. I thought it might be neat to show how my own explorations went, step-by-step, so that other curious Pythonistas can follow along.

Read more... )
2014-10-30 08:47 pm

lol APIs

This is possibly a sign I'm a bit sleep-deprived at the moment (I did the waking-up-early-to-go-birdwatching thing this morning), but I found this bit from the Flickr API docs for their "photo search" function immensely charming:
[parameter:] accuracy (Optional)
Recorded accuracy level of the location information. Current range is 1-16 :
    World level is 1
    Country is ~3
    Region is ~6
    City is ~11
    Street is ~16
Does this mean Flickr, at its lowest accuracy level, can distinguish between "photo taken on the moon" and "photo taken on earth"? That is the "world" level, after all... :)

I've been super-quiet on the Hacker School blogging and I hope to resume that soon; I've been so busy hacking and learning that I keep forgetting to blog, oops. Suffice to say I've been doing some rad stuff: yesterday I implemented a bitflipping attack on CBC mode encryption, today I spun up a quick Flask app that lets you search Bing via text message, now I'm working on a birding quiz app I've been planning to work on "someday" since April (eep!), and in between all that I've been learning Rust and RUST IS DELIGHTFUL FUN. I'll blather all about it in a post, for sure!
2014-10-23 04:02 pm

The Internet I Knew

Today the upstart social networking site Ello reaffirmed their promise to never sell user data or ads. Which is good for them, I suppose, but the following line from their announcement made me frown:
With virtually everybody else relying on ads to make money, some members of the tech elite are finding it hard to imagine there is a better way.

But 2014 is not 2004, and the world has changed.
We... we had ad-free social networking in 2004. It was called "one of your friends got a Dreamhost and put some forum software on it and everyone hung out there." If the website got really big and popular, maybe the owner would ask for donations from the users, and usually folks would give enough to keep the place afloat, because everyone wanted to keep hanging out there.

It wasn't glamorous. It didn't give anyone rounds of VC funding or make anyone rich. Sometimes the site would crash from some "IPS driver error" and a grumpy teenager with the heart of a future sysadmin would crawl onto AIM at 2AM to tell everyone they were working on a fix.

But we existed. And for some reason I can't help but feel a little slighted. Ello didn't invent the concept of people hanging out online without ads. (Take, for instance, the very site you're on now, Dreamwidth: another great example of a community bootstrapping and sustaining itself.)

I had similar grumpy feelings when Pinterest was blowing up a few years back—not because of any ill will toward Pinterest, but because of the breathless, astonished tone reporters seemed to take when talking about Pinterest. In particular, they seemed staggered by the fact that the site's users were almost all women, bringing them together in a way never seen before, and how did Pinterest discover the secret of drawing women to the internet?!

And yet, the "social networks" I hung out on during my preteen and teenage years were composed almost entirely of young women. I'm not even sure why that was the case (we talked about gaming and tech a lot, which were supposedly "guy" interests when I was a kid), but it was a prevalent enough gender skew that, on the rare occasion when someone joined with an obviously male handle, we'd joke about how "but there are no boys on the internet!" We were there the whole time; we didn't just start using the internet when Pinterest came out.

I suppose it's the difference between a Social Network TM in the Facebook and Google+ sense, versus the "social networks" I remember. Those "social networks" were small, and never made front-page news (or any news at all), and were more concerned with keeping to themselves than recruiting new members. They were "social networks" in the "people getting together and hanging out" sense. But Social Networks TM are big, and self-promote, and have money and influence, because there are a lot more people on the internet nowadays and more money to be made.

Which is fine. I just don't think it should be billed as this Totally New Thing. All sorts of folks have been on the internet for a long while now. Let's acknowledge that, at least a little.

Also of interest: Paul Ford's tilde.club and "how LGBTQ nerds helped create online life as we know it."
2014-10-16 12:48 am

Day 7: Python metaprogramming, or: implementing `namedtuple` the wrong way

I was reading Allison's blog post on how to start exploring Python internals, and one of the suggestions was: try implementing a Python library function without looking at it! I thought this sounded like splendid fun; also, one of the suggestions was namedtuple and I actually REALLY LIKE namedtuple but don't have occasion to use it often enough. So I dove in! Stuff I learned so far doing this:
  • Metaclasses! I already knew about these in a vague "it's like a thing that creates classes or something" sort of way, and since it seems like namedtuple creates class-like objects, I thought it'd be a good place to start. Probably the most interesting thing I discovered: the plain old type built-in, which I've always used just to check the types of objects, can also be used to dynamically create new classes! This seems like a super-odd and unintuitive dual functionality, and I found a throwaway comment that claimed this was due to historic/backwards compatibility reasons, but I wasn't able to determine what these reasons were. (Let me know if you know!)

  • With type() alone, you can create a pretty decent named tuple, which I coded up like so. Granted, it (a) isn't a tuple at all, (b) does some slightly frownyface manhandling of class properties, and (c) doesn't implement all the functionality of namedtuple... BUT, it does handle my most common use case for namedtuple, which tends to be: "Hey, I want a kind-of-throwaway class that'll be used only in a small section of the code, but that throwaway class will make what I'm doing SO MUCH MORE READABLE." Thus, tada! Instant objects with sensible properties!

  • But for some reason I got to wondering: could you make a function that, say, knows to simply create a Foo when you call namedtuple('Foo', 'my properties'), rather than having to do Foo = namedtuple('Foo', 'my properties')? It turns out the answer is YES, but you have to do evil things to make it happen. Essentially, Python maintains dictionaries of variables for you—try typing globals() or locals() into your Python interpreter to see!

    In order to auto-generate our Foo class, then, we want to add Foo to the local variable dictionary of the caller. (Meaning: if we're calling namedtuple('Foo', 'my properties') within our main method, we want Foo to be created in that main method, not just within the namedtuple call.) Turns out there's a sys._getframe function you can use to get, say, the current frame, or the parent frame... and then just tack Foo onto the parent frame and you're good to go!

    But that's all a terrible idea and you shouldn't do it. It's not good for you. It's not good for the planet. Don't be like me.
I've got an actual, good-for-the-planet implementation of namedtuple underway, so hopefully I can share a real gist of that with you all soon!
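
For anyone who wants to poke at the two tricks from the list above, here is a minimal sketch of both. It is not the code from the original gist: make_record and declare_record are hypothetical names invented for this example. The first builds a class with the three-argument form of type(); the second uses sys._getframe(1) to reach the caller's frame. One deliberate swap: it writes into the caller's module-level globals rather than its locals, because (as far as I can tell) adding a brand-new name to a function's locals through its frame does not make that bare name usable in CPython.

```python
import sys


def make_record(name, field_names):
    """Build a small class with type(name, bases, namespace).

    Like the janky version described above, the result is not a tuple
    at all; it is a plain class with one attribute per field.
    """
    fields = tuple(field_names.split())

    def __init__(self, *values):
        if len(values) != len(fields):
            raise TypeError("%s takes %d values" % (name, len(fields)))
        for field_name, value in zip(fields, values):
            setattr(self, field_name, value)

    def __repr__(self):
        pairs = ", ".join("%s=%r" % (f, getattr(self, f)) for f in fields)
        return "%s(%s)" % (name, pairs)

    namespace = {"__init__": __init__, "__repr__": __repr__, "_fields": fields}
    return type(name, (object,), namespace)


def declare_record(name, field_names):
    """The evil version: bind the new class in the caller's namespace.

    sys._getframe(1) is the frame of whoever called this function.
    This sketch writes to that frame's globals (an assumption of the
    example, not necessarily what the original code did).
    """
    cls = make_record(name, field_names)
    sys._getframe(1).f_globals[name] = cls
```

With this, Point = make_record("Point", "x y") gives a class whose instances have .x and .y, while declare_record("Foo", "a b") makes Foo appear at module level without any assignment. The same warning applies: don't do the second one in real code.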

Edit: Ned pointed out that the super(self.__class__, self).__init__() call I had in my init functions for the janky and trolly tuples wasn't quite right—calling super on our hand-rolled class gets us a NoneType, so it doesn't really make sense to call it. I updated the code to be more correct now. Thanks, Ned!
2014-10-09 11:24 pm

Day 4: Presentations and such!

Crypto challenge update: I can now decrypt repeating-key XOR and detect ECB encryption, woohoo! Now that I'm done with the first "set" of challenges, though, I think I'll take a bit of a break—they're super fun, and I'll come back to them later, but I want to start pairing more and explore some other things, too.
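
For the curious, here is a rough sketch of the two ideas mentioned above. It is not the code I wrote for the challenges, and the function names are made up for this example. Repeating-key XOR is shown only as the cipher itself (applying it twice with the same key gets the plaintext back); actually breaking it without the key takes more work than this. The ECB check relies on a common heuristic: ECB encrypts identical plaintext blocks to identical ciphertext blocks, so repeated 16-byte blocks are a strong hint. The 16-byte block size is an assumption (it matches AES).

```python
from itertools import cycle


def repeating_key_xor(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def looks_like_ecb(ciphertext: bytes, block_size: int = 16) -> bool:
    """Heuristic: True if any block of the ciphertext appears more than once."""
    blocks = [
        ciphertext[i:i + block_size]
        for i in range(0, len(ciphertext), block_size)
    ]
    return len(set(blocks)) < len(blocks)
```

The heuristic can miss ECB ciphertexts whose plaintext never repeats a block, so a False result proves nothing.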

Tonight there was a round of presentations from other Hacker Schoolers and goodness they were awesome. Highlights included: Allison poking around to see how the recursion limit is implemented in Python and discovering amusing details therein, Eunsong's JavaScript-based molecular dynamics simulator, and Tanoy demonstrating both his live coding skills and his excellent taste in music by making a Jekyll blog and dropping it on DigitalOcean in less than the amount of time it takes to listen to one rap song.

To wind down this evening, I wanted to dust off my old Heroku account and deploy a Flask app there (I've been trying to move some things off my Linode, and this seemed like an easy one to handle), and ran into a bunch of annoyances with key management. The first key I tried to give Heroku was rejected because "that's already being used by another Heroku account," which suggests I've got yet another account on the internet I've forgotten about, oops. The second key I used authenticated fine, but I couldn't push over git, since my git is configured with a different key. So I edited my .ssh/config, but the change didn't seem to help, and eventually I figured out that I had both an id_rsa and an id_dsa key, and I was referencing the wrong one. Sigh, key management. Hopefully I won't forget about the existence of this Heroku account too, heh.
2014-10-09 11:03 am

Day 3ish: Bit Twiddling & Raft

Alas, this post is late—I left my computer at Hacker School last night and thus couldn't post until I got back this morning. But I'm talking about what I did during Day 3 so this still counts as blogging every day, right?

Anyway! I made some real headway on the crypto challenges, which was satisfying, though, as one might expect, it turns out twiddling bits in Python is rather annoying compared to something like C. Python tries very, very hard not to let you operate on raw bits, so you end up doing a lot of awkward conversions. Like, for the task of "this hex string has been XOR'd against a single character; figure out what character that is," I wound up with some code that looked like this...
[chr(ord(byte) ^ key) for byte in hex_str.decode("hex")]
...which is (1) decoding the hex string, (2) reading that one byte at a time, (3) XOR'ing the value of the byte against the key, and (4) converting that back to a character representation. I wound up fumbling a bit getting those conversions nested correctly... I'm hoping to think of a more "systematic" way of handling these soon, maybe like a unicode sandwich for bit-twiddling. Or I could just convert everything to bitarrays and handle the problems that way; we'll see.
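
The one-liner above is Python 2 (str.decode("hex") is gone in Python 3). Here is a rough Python 3 sketch of the same four steps, plus one possible way to pick the key. The scoring function is an assumption made for this example, not the method I actually used: it just counts bytes that are common English letters or spaces, and all the function names are invented here.

```python
def xor_with_byte(hex_str: str, key: int) -> bytes:
    """Decode the hex string, then XOR every byte against a one-byte key."""
    return bytes(b ^ key for b in bytes.fromhex(hex_str))


def english_score(text: bytes) -> int:
    """Crude plausibility score: how many bytes are common letters or spaces."""
    common = b"etaoin shrdlu"
    return sum(1 for b in text.lower() if b in common)


def guess_key(hex_str: str) -> int:
    """Try all 256 one-byte keys and keep the one whose output scores best."""
    return max(range(256), key=lambda k: english_score(xor_with_byte(hex_str, k)))
```

Working on bytes objects end to end removes the chr/ord shuffling: iterating over bytes in Python 3 already yields integers, so only the hex decoding step remains as a conversion.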

I also spent the afternoon reacquainting myself with my faltering early attempt at implementing Raft in Python, which was last updated, uh, seven months ago. I hadn't realized I'd left it abandoned for so long! Definitely hoping to wrap that project up (or maybe just start over from scratch) before I leave New York...
2014-10-07 09:21 pm

Day 2: Monads! and other things

First, a follow-up on yesterday's lulz with the eBird data: I lied a bit when I said it was a tar file that was being troublesome. The initial download was a tar file, which decompressed to a few README-ish files and a gz file, but the actual trouble came when I tried to decompress the gz file, which contains the actual data.

I decided to see what gzip thought the size of the file should be when uncompressed, and, uh...

dhcp-0059526637-5b-99:ebd_relAug-2014 flowerhack$ gzip -l ebd_relAug-2014.txt.gz
         compressed        uncompressed  ratio uncompressed_name
         7232458369          2856865220 -153.2% ebd_relAug-2014.txt


Apparently gzip thinks my massive text file should be smaller once it's uncompressed??? (And definitely not >60GB like it tried to do?)
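
One well-documented quirk that would produce output like this: the gzip format (RFC 1952) ends each file with a 4-byte ISIZE field holding the uncompressed size modulo 2^32, and, as far as I know, traditional versions of gzip -l simply report that field. Any file of 4GB or more therefore shows up with a wrapped-around size. The sketch below reads the field directly and lists the true sizes that would wrap to a given reported value; the function names are invented for this example, and it assumes a single-member gzip file.

```python
import struct


def gzip_isize(path):
    """Return the ISIZE trailer field: the last four bytes of a gzip file,
    stored as a little-endian unsigned 32-bit integer."""
    with open(path, "rb") as f:
        f.seek(-4, 2)  # 2 means "relative to the end of the file"
        return struct.unpack("<I", f.read(4))[0]


def candidate_true_sizes(reported, how_many=16):
    """Sizes that would all be reported as `reported` after wrapping mod 2**32."""
    return [reported + k * 2**32 for k in range(how_many)]
```

candidate_true_sizes(2856865220) includes values above 60GB, which would be consistent with what the decompression tried to do, though nothing here proves that is what happened to this particular file.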

Read more... )
2014-10-06 10:17 pm

Hacker School Day 1: Project Ideas & Getting Started

I decided I'd like to try and blog every day while I'm at Hacker School. This will make my blog updates a bit spammier than I normally like, but it also seems like a fun way for me to track my own progress and share what I'm up to with various interested parties, so!

What I'll be working on! )

What I worked on today! )
2014-09-25 04:23 pm

What do I have to do to keep my website up?

The other day I was chatting with a non-technical guy who was curious about web development, and he asked me: "Say I make a website, and it has everything I want, and I just want it to stay on the internet. What all do I have to do to keep that website up? Is it a daily maintenance thing? weekly? yearly?"

I actually thought this was an interesting question, because there are so many variables at play here, and most of them aren't particularly obvious when you're just starting out. For fun, here's a rundown of some of the perils you'll face if you're "just" trying to keep things running.

Read more... )

I guess the big lesson here is that lots of folks imagine a web site is something like a painting. Someone paints it, you buy it, and then you invite all your friends over to admire how pretty it looks in your living room, and that's that. But it's more like building a house. Or rather, it's like building a house in some alternate universe where time is going by ten times as fast. All these people keep coming in and scuffing up the floor! There's too much junk in the attic! There was a huge flood and now some of the wood's starting to rot away! The kitchen sink is broken!

You can avoid worrying about a lot of these problems by paying other people to take care of your house, but at the end of the day, someone is doing a lot of work just to keep your site running.
2014-05-19 09:44 pm

Why I Like Vagrant

Today, I was talking to another developer who works with me on a certain OSS project, and they mentioned that they preferred setting up their development environment by hand, rather than using the Vagrantfile the project provides. This surprised me, because I've been using Vagrant in various forms since last September, and it's been such a delightful experience that I think everyone should try it. (Or, if you're involved with an open source project that still expects developers to do all their installing and provisioning by hand, consider Vagrant! They may thank you for it.)

The Problem )

The Solution )

Why not use Vagrant? )

It's not all ponies, of course. Whoever goes about writing the Vagrantfile needs to be thoughtful about it, and a few weird bugs can surface from time to time due to Vagrant magic. (Once I ran into a bizarre test failure that was ultimately caused by a unicode encoding issue that could only occur by running Ubuntu on an OS X host machine with NFS file sharing. I should do a writeup on that one sometime.) But on the whole, it makes the business of handling VMs much simpler and cleaner and hassle-free, so you can spend more time developing and less time mucking around in dependencyland.

You can check out Vagrant here!
2014-04-13 08:58 pm

Review: Sea of Fertility Tetralogy

Ordinarily I'm a little shy about writing up book reviews, but I was surprised at how difficult it was to google for a comparative review of the four books in Yukio Mishima's Sea of Fertility tetralogy. I would've very much liked to have seen more folks' thoughts on how the books compared to each other before I set out to read this series. So, for the sake of Making the Internet a Better Place™, I now offer my thoughts.

(There are spoilers—none so major as to spoil enjoyment of the books, I think, but I tend to not mind spoilers as much as the average person, so take that for what you will.)

Spring Snow )

Runaway Horses )

The Temple of Dawn )

The Decay of the Angel )

Ultimately, I feel that perhaps Mishima is at his best when writing about younger people. His work is romantic in character, full of bombastic, colorful prose that feels well-suited for his younger characters but a bit strange when they begin to age. And the subject matter simply fits his favorite themes better—young men acting rashly and nobly is inspiring; seeing aging men behaving the same way often seems more like erraticness and stupidity, and Mishima does not always handle the distinction elegantly.

Overall, I'd recommend Spring Snow to anyone, and recommend Runaway Horses if you really enjoyed Spring Snow. The Temple of Dawn is skippable, and The Decay of the Angel has its moments if you wish to see the series's conclusion.
2014-03-20 09:20 pm

On Devpain

With apologies to Khalil Gibran.

Your pain is the breaking of the barrier that encloses
your abstractions.

Even as the rebase of the branch must fail, that its
user may resolve the merge conflict, so must you know pain.

And could you keep your mind in wonder at the
daily miracles of your compiler, your pain would not seem
less wondrous than your joy;

And you would accept the posts on your bug tracker,
even as you have always accepted the POSTs that
pass through your servers.

And you would watch with serenity the
rejections of your pull request.

Much of your pain is self-made.

It is the unit tests by which the engineer within
you sanity-checks your bug-ridden code.

Therefore trust the tests, and fix their failures
in silence and tranquility:

For their hand, though heavy and hard, is guided by
the hasty hotfixes of the Unseen,

And the test coverage they bring, though it halts your build, has
been fashioned of the code which the Programmer has
filled with his own expletive-riddled commits.
2012-12-05 05:01 pm

Social Networking and the Self

A guy asked me on the airplane what I think of Facebook's future prospects. I told him that, while I think Facebook will still be around, it will lose a lot of viability and relevance in the future—possibly within five years, definitely within ten years.

I've held that opinion for a while, but as I was talking with this guy, I managed to get a sharper insight as to why I think this. But for me to properly explain that insight, first I need to provide a short explanation of something that (initially) seems totally unrelated:

Read more... )

See also "The Social Graph is Neither", a very interesting blog entry from the creator of Pinboard that makes some similar points, but also some very different ones, in a much more eloquent fashion than me.