Visit to UC Berkeley

Posted: August 7th, 2009 | Author: ryan | Filed under: Uncategorized

I am writing this post from my temporary desk in the RAD Lab at UC Berkeley, in the Computer Science Division. I’ll be spending some time here learning about Hadoop, a framework for distributed computing. Luckily for me, my long-time friend Michael Armbrust is a fourth-year CS PhD student here working on distributed applications. He is one of many people here doing research in large-scale distributed and parallel computing.

Starting in third grade, Michael and I would spend countless hours in front of our computers. We wrote little applications in QBasic, then graduated to Visual Basic, all while running a small business venture in the background. Our largest project was called ‘System Assistant’. The System Assistant was a plugin-based application that lived in the system tray. Its main function was to load DLLs of individual self-contained helper applications. I can’t even remember how many plugins we wrote for it, but it was pretty awesome.

After Michael picked me up from the North Berkeley BART station, we went to the CS graduate student social hour, which included wine, cheese, bread and even smoked salmon. After that we sat down for a bit to set some goals for my trip before going out on the town.

We were finally able to sit down in front of a keyboard yesterday, like old times, after dragging ourselves out of bed for a 9am conference call. Eventually coding ensued, and about 14 hours later, still sitting in the same conference room we had been in since that morning, we had managed to encapsulate my existing Python code so that it runs under Hadoop. Around 11pm that night we ran a Hadoop job against 10,000 PDS files totaling ~30GB. Processing took just over 15 minutes. The bottleneck turned out to be loading the data into HDFS, which took about an hour on its own.
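For anyone curious what wrapping existing Python code for Hadoop can look like, here is a minimal sketch. It assumes the Hadoop Streaming style, where a script reads records from standard input and writes tab-separated key/value lines to standard output; that is one common approach and not necessarily the one we used. The `process_record` function and the comma-separated record layout are hypothetical stand-ins, not my actual processing code or the PDS format.

```python
import sys


def process_record(line):
    """Hypothetical stand-in for the existing Python processing code."""
    fields = line.rstrip("\n").split(",")
    # Assumed layout: the first field is a key, the rest are values.
    return fields[0], len(fields) - 1


def run_mapper(lines, out):
    """Streaming-style mapper: write one tab-separated key/value line per record."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        key, value = process_record(line)
        out.write(f"{key}\t{value}\n")


if __name__ == "__main__":
    # When run as a streaming mapper, records arrive on stdin.
    run_mapper(sys.stdin, sys.stdout)
```

A script like this would be handed to Hadoop Streaming through its `-mapper` option, and it can be tried locally first by piping a sample file into it.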

I’ve got a really good foundation now and can spend next week tweaking and polishing everything.


