Pah.

posted by luis

Digg hates me, apparently, so my dreams/nightmares of being digg-fucked (pardon my French) will not be coming true anytime soon.

On the (relatively) brighter side, gibbity was mentioned in this Philippine news site, which rather embarrassingly quoted me as saying that IceRocket’s search results were "foreign-language-infested." Wonderful. Not even out the door yet, and I’m already putting my foot in my mouth. (I had to read the piece twice before I realized that they had lifted the comment from a blog entry I had written just last week. I must admit that I’ve never actually had anyone refer to me by my last name before, and it’s … perplexing, to say the least.)

And for the record: I didn’t mean "infested" in a bad way. Honest.

Help Me Digg?

posted by luis

http://www.digg.com/gaming/People-powered_Game_Discovery

A grand total of 6 diggs is NOT going to get Gibbity on digg’s front page, I can tell you that.

Collaborative News, and the Bubble

posted by luis

There’s an interesting discussion on Publishing 2.0 about how the new internet bubble will be brought about by the proliferation of community-created media, because the sheer volume of generated content will undermine the economics of the industry.

This is interesting because sites like CommonTimes.org and NewsVine.com are at the forefront of this supposed bubble, and the speed of adoption is such that the traditional media industry isn’t being outnumbered so much as absorbed. Jason Kottke mentioned that blogs in particular are going to be difficult to differentiate from traditional media. He gives the example of sites from Weblogs Inc or Gawker Media, which are blurring the lines between traditional and non-traditional journalism as their reader base grows.

I think that over the next few years that line will continue to blur, until it is near-impossible to tell the difference between the two. Consider how an average newswire works: when a story breaks, typically it is shot off to an offshore (probably Indian) journalist to write up the news flash and send that on down the wire to the various agencies subscribed to the service. A more experienced journalist then goes over the story details and writes a lengthier, more in-depth piece based on the facts from the flash. (You can read more about this and other outsourcing initiatives in Thomas Friedman’s "The World Is Flat.")

The interesting thing about this process is that there’s very little difference between that and what you would normally see on NewsVine or CommonTimes. (You could actually combine the up-to-the-second urgency of Digg and the reportage functionality of NewsVine and come up with a very similar workflow, depending on the subject matter.) The key difference here is the people who are writing the analyses, but even there, I think, we are starting to see a lot of overlap. Professional journalists are, after all, simply people with degrees and connections. If they’re very good, they’ll also have reputations to protect, and if they’re really, really good, they’ll have opinions that people will subscribe to and other journalists will be influenced by. Essentially: it’s exactly what happens online, every day. If I read about Cory Doctorow lambasting red-tailed sportive lemurs, for example, I’d be hard-pressed to come up with a decent winning trait for that species as well.

The point I’m trying to make here is that non-traditional reportage mimics its traditional "evil twin" in so many ways that the only real difference is the number of active voices. (I suppose you could also argue that traditional media has fewer entry points and a higher standard of quality, but the popularity-driven nature of the collaborative web has many of the same effects, where weaker work is largely ignored and links picked up by "leaders" experience tremendous, server-shaking surges in visitors.)

That "number of active voices," though, is key, because it lengthens an already long tail to the point of oversaturation. That oversaturation won’t just come from third-party outlets either; after all, what’s to stop the New York Times or CNN or any other established news agency from making their own in-house CommonTimes, with reportage from their readers? If they’re worried about being outdated by non-traditionals, then coming up with their own (enforceable) brand of citizen-journalism would seem to be the best way to make sure their market share is maintained.

What conclusions can we come up with from all this? To be totally honest, I don’t know. I do know that the bubble is starting to emerge, and it is firmly rooted in media and the methods we’ve developed to generate it. I also know that people will not stop using these methods anytime soon, and it will create a major dissonance in how people can expect to make money from content. It may mean that traditional news agencies will have to start tightening up, in much the same way as the music industry has clammed up against the tide of piracy, et al. As more collaborative news sites pop up, I can imagine the market divvying itself up into progressively smaller slices and smaller audiences.

I wanted to end this piece with a big finish, but decided to end with a hypothetical question instead: how many journalists do you think 1.06 billion internet users actually need?

[ NOTE: This article was originally published at http://gutter.newsvine.com. Newsvine is currently in an invitation-only beta, so I thought I’d post it here as well. ]

Optimization Time

posted by luis

[ WARNING: Extremely geeky technical content ahead. Commbadges should be firmly affixed and tricorders should be at the ready. ]

One of the Gibbity features I’m most excited about is the ability to explore the games listing by tags. I got the idea from browsing Emusic’s very nice directories last week and decided to implement something similar in Gibbity.

In a nutshell: users choose one of the top 20 tags with which to begin their search. So assuming you chose "pc," you’d then be presented with a page containing all of the games tagged "pc." On the sidebar you’d also find a list of related tags, which you can use to further refine your search, e.g., "strategy," "rpg," "fps," etc. Clicking "fps" gets you a page with games matching "pc" and "fps," with its own sidebar of related tags. Theoretically you could refine your search an infinite number of times (although in practice you’d probably run out of matching games after the third or fourth tag choice).
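The drill-down described above can be sketched roughly as follows. This is a minimal Python illustration rather than Gibbity's actual PHP code; the sample games, the `GAMES` dictionary, and the function names `games_matching` and `related_tags` are all made up for the example.

```python
from collections import Counter

# Hypothetical sample data: game title -> set of tags (not Gibbity's real data).
GAMES = {
    "Half-Life 2": {"pc", "fps"},
    "Civilization IV": {"pc", "strategy"},
    "Halo 2": {"xbox", "fps"},
    "Deus Ex": {"pc", "fps", "rpg"},
}

def games_matching(selected):
    """Return the titles of games that carry every selected tag."""
    wanted = set(selected)
    return sorted(title for title, tags in GAMES.items() if wanted <= tags)

def related_tags(selected):
    """Count the other tags found on the matching games, most common first."""
    wanted = set(selected)
    counts = Counter()
    for title in games_matching(wanted):
        counts.update(GAMES[title] - wanted)
    # Sort by descending count, then alphabetically so ties are stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

print(games_matching(["pc"]))        # first click: everything tagged "pc"
print(related_tags(["pc"]))          # sidebar of tags to refine with
print(games_matching(["pc", "fps"])) # second click: "pc" and "fps"
```

Each click simply adds one more tag to the selected list, which is why the nesting can go on until no games are left.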

I was pretty proud of this infinitely-nesting search when I put it together a few days back, but there were some major doubts in my mind whether I could write it in such a way as to be resource-friendly (i.e., it wouldn’t bring the server to its knees every time a few dozen people try to use it).

True enough, the search was a friggin’ hog. I ran a very simple script timer to see how long it took the PHP engine to process the instructions of my sample pc+fps search, and got an average of about 0.3 seconds on each run.
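For anyone curious what a "very simple script timer" might look like, here is a rough Python sketch (my real one was in PHP). The `sample_search` function is a placeholder that just burns some CPU, and `average_runtime` is a name invented for this example.

```python
import time

def average_runtime(func, runs=10):
    """Call func() `runs` times and return the mean wall-clock seconds."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        func()
        total += time.perf_counter() - start
    return total / runs

def sample_search():
    # Stand-in for the real pc+fps search; it only does some busywork.
    return sum(i * i for i in range(10_000))

print(f"average: {average_runtime(sample_search):.4f} seconds")
```

Averaging over several runs matters because a single measurement can be thrown off by whatever else the server happens to be doing at that moment.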

Now, 0.3 seconds may not seem like much, but I can tell you that that is a lifetime for a multi-user web-based application. Most scripts shouldn’t take longer than a hundredth of a second to run, because you are assuming an environment where the server is getting hammered by hundreds of requests per second. (By comparison, Gibbity’s homepage only takes 0.02 seconds to render from top to bottom.)

After trying various things to optimize the matching code, I simply could not get the execution time any lower than a quarter of a second. Finally, I decided that the only way I was going to shave a few milliseconds off this operation was to cheat, and cache the results.

Result caching is basically what it sounds like: instead of actually searching for games tagged "pc" and "fps," the engine cheats and serves up stored results from previous searches. The idea is that you become very efficient when serving popular searches like the aforementioned pc+fps, although you take a small hit when running obscure searches like, say, pc+fps+balloons.

There are a bunch of different ways to implement caching. The (very basic) method I chose was to save the actual PHP array of search results into a text file, i.e., all games matching "pc" are in a file called tags_pc.cache, and all games matching "fps" are in a file called tags_fps.cache. So when anybody asks for pc+fps, all the engine has to do is look for the two *.cache files, and intersect their values. The beauty of this method is that even the obscure searches (pc+fps+balloons) benefit from the caching, because only "balloons" will actually have to be run.

Now that I had gotten the caching part working, my next problem was figuring out how to keep the cache fresh. Since the cache is totally disconnected from the present state of the site, new games could be added at any time without being reflected in the cache.

My first idea was to destroy and rebuild specific caches every time a new game and/or tag was introduced by a user. So for example, if someone added that new game "Hellgate: London" and tagged it as "pc," "fps," and "mmorpg," I would destroy and rewrite those 3 tags_*.cache files. That would mean 100% accurate caches at all times, which is great. The problem, though, is that popular tags like "pc" have pretty big lists of games ("pc" has 668 games already, and I haven’t even debuted the site yet), so it’d be an equally large resource hog to write and rewrite those caches every time a user makes an entry.

My second idea, and the one I eventually went with, was to have a simple expiration on each cache. So if we run our search and we find that one of the caches is older than an hour, we destroy it, rebuild it and display its contents. That means that one person every hour or so will get a slow search (in the region of 0.3-0.4 seconds) but everyone else who runs a similar search within that time period gets a free ride. By "free ride," I mean 0.015 second execution times, which are the results I’ve been getting from my testing (Savings of over 90% = good).
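The expiration check itself is tiny. Below is a Python sketch of the idea (again, not the actual PHP); `rebuild`, `is_fresh`, and the made-up ids are hypothetical, and the optional `now` argument exists only so the expiry can be simulated without waiting an hour.

```python
import json
import tempfile
import time
from pathlib import Path

MAX_AGE_SECONDS = 3600  # the one-hour expiration described above
CACHE_DIR = Path(tempfile.mkdtemp())  # throwaway directory for the sketch
rebuilds = []  # records every time a cache has to be rebuilt

def rebuild(tag):
    """Hypothetical slow query; returns made-up game ids for the tag."""
    rebuilds.append(tag)
    return [1, 2, 3]

def is_fresh(cache_file, now=None):
    """True if the file exists and was written less than an hour ago."""
    if not cache_file.exists():
        return False
    now = time.time() if now is None else now
    return now - cache_file.stat().st_mtime < MAX_AGE_SECONDS

def games_for_tag(tag, now=None):
    """Serve the cached list while fresh; otherwise rebuild and rewrite it."""
    cache_file = CACHE_DIR / f"tags_{tag}.cache"
    if is_fresh(cache_file, now):
        return json.loads(cache_file.read_text())
    ids = rebuild(tag)
    cache_file.write_text(json.dumps(ids))
    return ids

# Relative saving from the timings quoted above: 0.3 s uncached, 0.015 s cached.
saving = 1 - 0.015 / 0.3
print(f"saving: {saving:.0%}")
```

Plugging in the numbers from my testing, going from 0.3 seconds to 0.015 seconds works out to a saving of about 95%, which is where the "over 90%" comes from.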

The performance increase was so dramatic that I almost forgot that there were still many, many parts of the site that needed optimization, but at least I’m off to a good start :) 

LabanMan

posted by luis

My friend Manny, who I worked with on many, many projects between 2002 and 2005, has cancer. You can read about his ongoing experiences at his blog. You can also donate to his cause by going to the fundraiser below, or contributing money directly to the event’s organizers.

[Image: iDeejay e-flyer]

Crazy Corporate Meetings

posted by luis

I was at the most ridiculous meeting today, in the conference room of one of Makati’s more prestigious medical centers. In attendance were: 2 senior-level managers, 2 systems administrators, 2 account managers, 1 technician and myself, the long-haired guy in rubber shoes.

The meeting was convened to unravel one particularly testy mystery, which I had been trying in vain to solve over the phone for the past few weeks, i.e., "What is the username and password for your website’s hosting account?"

No one in the room knew the answer.

A lot of hemming and hawing ensued, with both the client and the hosting provider lobbing progressively more cutting remarks at each other in an effort to pin the blame for the loss of this information. It became apparent to me, during the course of all this, that although the managers barely understood what the missing login was even going to be used for, their blame-assignment skills were top-notch, and eventually the folks from the hosting provider were overcome.

Finally, the hosting folks made a conference call to their own support hotline and asked one of the reps to talk to me directly. I repeated my question, while everyone in the room listened intently. The whole thing took about thirty seconds, and ended with the positively surreal phrase, "So let me just confirm that, our company account’s password is ‘12345’?"

More Gibbity

posted by luis

We’re on the move! Gibbity is coming out of beta within the next 48 hours and moving to its new home with a different provider.

I really believe that the gibbity concept has a lot of potential, and I’m willing to put my money where my mouth is, so to speak. With that in mind, I’m shelling out some more cash to give gibbity a server separate from my current dedicated machine (which is already running quite close to capacity … highfiber alone chugs through over 11 GB of bandwidth per day). So making the move to a separate machine was the first step, in my mind.

The next step is, of course, pimping the site to as many people as I can. So if you receive an invitation email from me next week, please don’t be offended :)

And if you haven’t checked it out already, gibbity is viewable (at least until I start the server-transfer) right here. I’m actually pretty proud of how Frankenstein-ian the whole site is: each game page is assembled from data coming in from Amazon, MSN Search, Google Blogsearch, Yahoo Search, Technorati and Wikipedia, and I’m currently working on adding feeds from Del.icio.us and IceRocket as well (if I can just figure out how to kill off IceRocket’s foreign-language-infested results). Of course, the most important part of the site is still the user-generated popularity lists and clouds.

Go Gibbity

posted by luis

The title gibbity has a 79.6% chance of being a bestselling title!

In other news, I love lulu.

Yahoo > Digg?

posted by luis

A tasty rumor, if I ever saw one:

… little birdies have informed me that Yahoo has an offer on the table to buy Digg for somewhere in the range of $35 million dollars.

Being acquired by one of the big 3 is pretty much every web entrepreneur’s dream. Not only does it make you obscenely wealthy, but it validates your website’s mission as well (which may not seem quite as important, but I imagine it must feel great to be able to say "I told you this was a good idea" to all the naysayers).

Reading about this rumor reminded me of an interesting piece over at Signal Vs Noise about the dangers of building a company with an acquisition as its end goal:

If you’re about to build anything, don’t build it to flip or you’re almost guaranteed to flop. Sure, you could win the Yahoo lottery, but the odds aren’t in your favor. If 9 out of every 10 new companies fail, I can’t imagine the minute percentage of successful acquisitions. 1 in 100? 1 in 1000? Worse?

Although I totally agree with their philosophy of building a company with growth in mind, I do believe that a business’s chances of success are a far more complex matter than a mere statistical lottery. Like I said above, getting acquired by a multinational company is validation, in more ways than one. It means that your business had a solid idea, a great implementation and the potential to grow even further with the proper resources. Either that or they bought you out in an effort to kill the competition, but that in itself just confirms that you were on to something big.

So it’s not so much a "random act of fate" as it is "survival of the fittest," because it’s the people who have the great ideas and work the hardest to bring them to fruition that are ultimately rewarded for their efforts.

General Motors’ Parade of Progress

posted by luis

[Image: GM Parade of Progress]
I saw a CD player in Megamall last year that looked exactly like this.
