Related link: http://codecon.info
On day three of CodeCon 2.0, Andrew Loewenstern presented Khashmir, a distributed hash table library based on the Kademlia algorithm. The Kademlia algorithm follows in the footsteps of the Chord algorithm, but improves on its performance in a few areas. The basic idea behind distributed hash table (DHT) algorithms is that, like regular hash tables, they can be used to store key-value pairs, but the nodes of the hash table may reside on different peer servers. This makes DHTs well suited for implementing distributed indexing schemes in P2P systems. The Khashmir library uses UDP via the AirHook library; using UDP allows Khashmir to reach peers inside NATted networks, which is crucial for getting widespread acceptance of P2P systems. The API for Khashmir is pretty simple: it supports findNode(), valueForKey(), storeValueForKey(), and addContact() as its basic operations.
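To make the DHT idea a bit more concrete, here is a minimal sketch of the XOR distance metric that Kademlia uses to decide which peers are "closest" to a key. This is not Khashmir's code: the function names (`make_id`, `xor_distance`, `closest_nodes`) and the value of `k` are my own assumptions for illustration.

```python
import hashlib

# Kademlia identifies both nodes and keys with 160-bit IDs (the size of a SHA-1 digest).
ID_BITS = 160


def make_id(name: str) -> int:
    """Hash an arbitrary string to a 160-bit integer ID (hypothetical helper)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's notion of distance: the bitwise XOR of two IDs, read as an integer."""
    return a ^ b


def closest_nodes(target: int, node_ids, k: int = 8):
    """Return the k known node IDs closest to the target ID under XOR distance.

    The value of k here is an assumption chosen for the example.
    """
    return sorted(node_ids, key=lambda node_id: xor_distance(node_id, target))[:k]


if __name__ == "__main__":
    peers = [make_id(f"peer-{i}") for i in range(20)]
    key = make_id("some-file-name")
    # A value stored under `key` would live on the peers closest to it.
    for node_id in closest_nodes(key, peers, k=3):
        print(f"{node_id:040x}")
```

A lookup for a key repeatedly asks the closest peers it knows about for even closer ones, which is roughly the job a call like findNode() performs in a library of this kind.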
Deep Green was presented by Michael F. Korns. The Deep Green presentation stood out a bit from the other presentations, since it was a financial application with a long history, and it makes money! Deep Green is a web application that was designed to allow a single person to run a mutual fund of up to $1 billion by themselves. The Deep Green back-end system includes three million machine learning agents (neural nets, Bayesian classifiers, nonlinear multivariate regression, etc.) that continually analyze current and historical stock market data to provide the end user with information on how to manage the mutual fund. The most confusing aspect of the presentation was the business arrangement surrounding Deep Green. Korns and Associates, the developers of Deep Green, use the system to manage an internal mutual fund that has outperformed all other mutual funds. The proceeds from this internal fund are used to further develop Deep Green. Furthermore, the company Invest by Agent is tasked with bringing the Deep Green software to market, but currently has no customers. Michael’s presentation was the most enigmatic presentation at CodeCon, and I am still confused about how all the pieces fit together. But I must say that hacking on code to make money to continue hacking seems like an admirable model.
Roberto Bayardo presented YouServ. The main idea behind YouServ is to allow users to host web pages even if they don’t have a web server that can be available 100% of the time. Running a web server is not an easy task for people who have transient net connections or are running firewalls or NATs. Their websites may not be up all the time, which makes it difficult for users to view the pages or for Google to index them. YouServ takes a P2P approach to provide an easy-to-use cooperative web server that works automatically behind firewalls or NATs, ensures secure content access, provides a single login for restricted access, and provides search capabilities. The system relies on a central server to act as the single login access point and as a presence manager, which keeps track of the nodes in the P2P network that are currently available to serve the content of the cooperative web server. The central server stores no content and only takes care of the lightweight tasks needed to coordinate the network. Each of the peers in the network does the heavy lifting of serving the content of the pages in the network, but never deals with user authentication. To provide search services, the main server caches site summaries and handles the first step of each search. Once the main server has used the cached site summaries to narrow down the list of P2P nodes that could contain matching content, the P2P nodes themselves are contacted to complete the search. YouServ is written in 100,000 lines of Java 1.3 code and is self-contained except for its dependence on bind for DNS services.
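The two-stage search described above can be sketched roughly as follows. This is only my own illustration of the idea, written in Python rather than YouServ's Java; the `Peer` and `Coordinator` classes, the word-set "summary", and every method name are assumptions and not YouServ's actual design.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    """A node that hosts content (hypothetical stand-in for a YouServ peer)."""
    name: str
    pages: dict  # maps a page path to its text
    online: bool = True

    def summary(self) -> set:
        """A crude site summary: the set of all words appearing on the site."""
        words = set()
        for text in self.pages.values():
            words.update(text.lower().split())
        return words

    def search(self, term: str) -> list:
        """Full search over this peer's own pages."""
        term = term.lower()
        return sorted(path for path, text in self.pages.items()
                      if term in text.lower().split())


class Coordinator:
    """Central server: stores no page content, only presence and cached summaries."""

    def __init__(self):
        self.peers = {}
        self.summaries = {}

    def register(self, peer: Peer) -> None:
        self.peers[peer.name] = peer
        self.summaries[peer.name] = peer.summary()

    def search(self, term: str) -> dict:
        term = term.lower()
        # Stage 1: use cached summaries to narrow down the candidate peers.
        candidates = [name for name, words in self.summaries.items() if term in words]
        # Stage 2: ask only the candidates that are currently online.
        results = {}
        for name in candidates:
            peer = self.peers[name]
            if peer.online:
                hits = peer.search(term)
                if hits:
                    results[name] = hits
        return results


if __name__ == "__main__":
    coordinator = Coordinator()
    coordinator.register(Peer("alice", {"/index.html": "Photos of my cat"}))
    coordinator.register(Peer("bob", {"/notes.html": "Java notes"}))
    print(coordinator.search("cat"))
```

The point of the split is that the central server only does cheap filtering, while the peers that actually hold the pages do the expensive matching.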
The next presentation was by Rich Bodo on Bayonne, GNU’s VoIP project. Unfortunately, this presentation was quite confused: Rich used two computers and two sets of slides, and he jumped quickly between them, which made the presentation hard to follow. Then his demonstration did not work, and he attempted to recompile kernel modules while the audience was watching, leaving the audience dangling for minutes at a time. In the end I gathered that the system uses a three-tiered model (though I’m still not clear exactly what the three tiers are), is written in C++ and has a web services interface. It also contains its own internal scripting language that can be used to script advanced telephony applications. He briefly showed a script that would implement a minimal voicemail system using Bayonne, but he was unable to get the script to run. If you want more concrete details, please check the website; I’m still confused.
Last, but not least, was Raph Levien’s excellent Advogato presentation. Advogato lets open source developers rate (certify) one another to establish a reputation rating for each of its members. It accomplishes this with a crafty trust metric that determines membership in a community. The input to the trust metric is a graph of certifications (person A certifies person B at this level, person B certifies person C at that level…), and the metric assumes that the input data is not 100% clean. This means the input data could be subject to malicious attacks, spammers, stupid users and legitimate controversy in the certifications. Despite these adverse conditions, Advogato still produces good results, and it does so without a central authority ensuring that everyone is playing fair. The trust metric is resistant to a variety of attacks, but it is not perfect: it gives good results for most cases. Raph didn’t go into the details of how the algorithm works due to the higher math required to grok it. The trust metric employs network flow theory, eigenvectors, power law vectors and social network theory, which is enough math to scare off even most CodeCon geeks. Raph also discussed the success of Advogato and of the other prominent trust metric on the net: Google’s PageRank system. The PageRank system also employs a trust metric and has a latency of only 100ms for a dataset of over 3 billion nodes. Impressive to say the least.
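Since Raph skipped the math, here is a toy sketch of what an eigenvector-style trust computation over a certification graph can look like. To be clear, this is not Advogato's algorithm (which, as I understand it, is built on network flow) and not Google's production PageRank. It is a simplified, seed-based variant of the PageRank idea computed by power iteration, and the function name, the `seeds` parameter, and the damping value are all my own assumptions.

```python
def trust_scores(certs, seeds, damping=0.85, iterations=50):
    """Propagate trust from a set of seed members along certification edges.

    certs maps a member to the list of members they certify.
    seeds is the list of members trusted a priori (an assumption of this sketch).
    """
    nodes = set(certs) | set(seeds)
    for targets in certs.values():
        nodes.update(targets)

    # All trust starts at the seeds.
    scores = {node: 0.0 for node in nodes}
    for seed in seeds:
        scores[seed] = 1.0 / len(seeds)

    for _ in range(iterations):
        new_scores = {node: 0.0 for node in nodes}
        back_to_seeds = 0.0
        for node, score in scores.items():
            targets = certs.get(node, [])
            # A fixed fraction of every member's trust flows back to the seeds.
            back_to_seeds += (1.0 - damping) * score
            if targets:
                share = damping * score / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # Members who certify nobody return all their trust to the seeds.
                back_to_seeds += damping * score
        for seed in seeds:
            new_scores[seed] += back_to_seeds / len(seeds)
        scores = new_scores
    return scores


if __name__ == "__main__":
    certifications = {
        "alice": ["bob", "carol"],
        "bob": ["carol"],
        # Two fake accounts certifying each other, with no link from real members.
        "mallory": ["mallory2"],
        "mallory2": ["mallory"],
    }
    for member, score in sorted(trust_scores(certifications, seeds=["alice"]).items()):
        print(f"{member}: {score:.3f}")
```

Because trust only enters the graph at the seeds, the two fake accounts that merely certify each other never receive any, which gives a small taste of the attack resistance Raph was talking about.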
And that wrapped up CodeCon 2.0. Compared to last year, CodeCon has matured immensely, and both Len Sassaman and Bram Cohen have taken their lessons from CodeCon 1.0 to heart and delivered an excellent conference. Congratulations to Len and Bram and their small army of volunteers, and thanks to Up Networks, Google, No Starch Press, and LinuxFund for sponsoring CodeCon 2.0.
CodeCon 2.0 — what did you think?



