April 2005 Archives

Harold Davis

In a contextual sea change, Google has announced a beta program that allows publishers to embed AdSense contextual ads in RSS and Atom syndication feeds.

“This is gonna be huge… like HUGE HUGE HUGE!!!” gloats Jason Calacanis, the President of Weblogs, Inc., which - as you might expect - publishes a bunch of syndication feeds. As a publisher of syndication feeds in a small way, I suppose I ought also to be glad. But actually, I think monetizing RSS and Atom feeds in this way in part defeats the purpose of having a feed. A feed is simply an information stream that points to further information. If feeds get cluttered, they cease to be useful, and subscribers will cancel. In some sense, the RSS or Atom feed is in and of itself an advertisement for the full content.

Case in point: it is against the policies of Hot Feeds and Syndication Viewer to display feeds that carry ads.

Here’s the way the unofficial Apple feed from Weblogs, Inc., which Calacanis is using to test the syndication AdSense program, looks (with ads) in Syndication Viewer. Each Google AdSense ad is simply an HTML table embedded in feed items like this [identifying numbers and actual link omitted]:

The good news: it ought to be trivial to parse these ads out of incoming feeds, simply by eliminating table tags and their contents from item entries if in no other way. I will certainly do so in Syndication Viewer.
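
As a rough illustration of the table-stripping idea, here is a minimal shell sketch. It is not the code used in Syndication Viewer; the function name strip_tables and the sample item text are made up for the example, and it assumes each item body arrives as a single line of unescaped HTML with no nested tables.

```bash
#!/usr/bin/env bash
# Sketch only: remove every <table ...>...</table> run from each input line.
# Assumptions: one feed item body per line, unescaped HTML, no nested tables.
# strip_tables is a hypothetical helper name, not part of any real tool.
strip_tables() {
    awk '{
        line = $0
        # Find the next opening table tag, case-insensitively.
        while ((start = match(tolower(line), /<table[^>]*>/)) > 0) {
            rest = substr(line, start)
            end = index(tolower(rest), "</table>")
            if (end == 0) break          # unclosed table: leave the rest alone
            # Keep the text before the table and after its closing tag (8 chars).
            line = substr(line, 1, start - 1) substr(rest, end + 8)
        }
        print line
    }'
}

# Made-up sample item, for demonstration only.
printf '%s\n' 'New PowerBook rumors <table border="0"><tr><td>Ads by Google</td></tr></table> read on' | strip_tables
```

A real feed may entity-escape its markup or use tables for legitimate content, so a production filter would want a proper parser and a narrower match.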

brian d foy

Related link: http://www.perl.com/pub/a/2005/04/28/bdfoy.html

O’Reilly editor chromatic interviews me about my magazine, The Perl Review.

I had almost forgotten about that since we did the interview back in January. Both of us were distracted with book projects for O’Reilly.

chromatic

Related link: http://www.bsdcan.org/2005/activity.php?id=54

Dan Langille tipped me off that BSDCan 2005 will host the premiere of a new open source tool to perform opportunistic live mirroring of remote systems over the Internet. Anywhere you go, if you have an Internet connection, the tool will mirror changes to your filesystem to your backup server. It’s NetBSD only for now, but it sounds useful enough that there’ll soon be ports to other operating systems.

Derek Sivers

Related link: http://www.cdbaby.com/

CD Baby only has one warehouse today. But by the end of the year, we’ll have multiple warehouses.

I used to count current stock/inventory by using a simple database table called inventory that really only counted items_received, then subtracting the quantity of items sold from the order’s lineitems table.

But with multiple warehouses, this won’t work anymore! Each warehouse needs to know how much its own current stock is. Unless I were to actually tie an order’s lineitem with the warehouse_id it came from, I would need a new approach.

So - the new approach is this:

INVENTORY is a detailed historical trace of everything in and out of the warehouse.

STOCK is the current status: how many of each item are in each warehouse right now. Used for quick lookups.

## THE DATABASE TABLES:
INVENTORY:
id | warehouse_id | item_id | quantity_in | quantity_out | created_at | person | shipment_id | notes

STOCK:
id | warehouse_id | item_id | stock_status_id | quantity

(SIDE NOTE: stock_status_id is just something we can override by hand: if a musician tells us their CD won’t be arriving for 3 more months, we’ll set it to a DELAYED status. If it’s permanently out of stock, we’ll set it to PERMANENT status. Etc.)

Every time something adjusts the inventory, the stock table needs to be updated. So I decided to try my first PostgreSQL trigger. (See the PostgreSQL manual on PL/pgSQL).


CREATE OR REPLACE FUNCTION update_stock() RETURNS trigger AS '
DECLARE
    w integer;
    i integer;
    instock integer;
BEGIN
    -- A deleted row is only visible as OLD; inserts and updates use NEW.
    IF TG_OP = ''DELETE'' THEN
        w := OLD.warehouse_id;
        i := OLD.item_id;
    ELSE
        w := NEW.warehouse_id;
        i := NEW.item_id;
    END IF;
    -- SUM over zero rows yields NULL, so instock is NULL only when no
    -- inventory rows remain for this item in this warehouse.
    SELECT INTO instock SUM(COALESCE(quantity_in, 0) - COALESCE(quantity_out, 0)) FROM inventory WHERE item_id=i AND warehouse_id=w;
    IF instock IS NULL THEN
        DELETE FROM stock WHERE item_id=i AND warehouse_id=w;
    ELSE
        UPDATE stock SET quantity=instock WHERE item_id=i AND warehouse_id=w;
        IF NOT FOUND THEN
            INSERT INTO stock (warehouse_id, item_id, quantity) VALUES (w, i, instock);
        END IF;
    END IF;
    RETURN NEW;
END;
' LANGUAGE plpgsql;

Everything before the first SELECT statement is just some basic setup stuff: you have to use the DECLARE section to first say what variables you’re going to be using below.
Then I had to start my BEGIN section by changing how I got the warehouse_id and item_id based on whether I had just done a DELETE on the inventory table (use OLD), or an INSERT/UPDATE (use NEW).
Then the basic stuff begins:
Subtract the total-out from the total-in for this item for this warehouse.
If it’s not found, then make sure stock table is not caching old data (DELETE).
Update the stock with the new current quantity.
Or if this is the first time, insert it.
That’s it!

Then, to have the database use this function automatically, I just added this to my database definition:

CREATE TRIGGER stock_update AFTER INSERT OR UPDATE OR DELETE ON inventory FOR EACH ROW EXECUTE PROCEDURE update_stock();

Now any time I make ANY adjustment to a line in the INVENTORY table (insert, update, delete), the STOCK table is instantly updated for that item in that warehouse. It’s instant and wonderful.

Any improvements or suggestions are welcome.

brian d foy

A friend and I were talking about blogs today, and started debating which has had more impact on journalism: cable news or blogs. We disagreed.

We weren’t talking about specific stories or events. We argued about the effect on news-gathering, ethics, and other journalistic concerns.

I won’t tell you who picked which side just yet. I want to hear what other people think. There’s a beer riding on this, so choose wisely. :)

Derek Sivers

I just got back from 10 days in Japan to check out the independent music scene there. Here, I’ll write up what I found in case it’s of use to anyone else.

I’ve been curious about Japan, because CD Baby’s biggest customers are there. Though we have only about 5000 customers in Japan today, they’ve spent over $1 million in CDs at CD Baby.

But we have very few musicians from Japan, probably because of language differences and the problem of mailing a box of CDs from Japan off to America. I had been wondering if it’d be wise to set up a local representative: a point-person there to be “CD Baby Japan” so that the musicians of Japan could talk to someone Japanese to ask questions, mail their box of CDs locally, and have them shipped to fans directly. That person could also act as a remote warehouse for our top-sellers in Japan.

I had been wondering so much about this that I decided to check out the price of visiting, and it was only $475 round trip airfare! Yes, the hotels and trains would be expensive, but all-in-all, worth a trip. So I booked it.

I emailed our top 20 customers (some of whom had spent over $50,000). Most of them were resellers, running CD shops in Japan, so I thought they’d be perfect to ask for help getting to know the scene. (Soon you’ll see, though, that this might have been unwise, as it skewed the perspective.)

I’ll break down my interesting findings by subject. PLEASE NOTE that everything said here is NOT definitive fact, but just what I saw and heard from the dozen people I met with last week (April 2005). It wasn’t the most well-rounded perspective, so consider this to be just one point of view.

*** E-COMMERCE ***

There’s not much buying online. Though many people have credit cards (mostly ATM-style debit cards) they’re generally not trusted, and so the whole e-commerce scene in Japan now in 2005 feels like America in 1996. Early-adopters doing it, but the majority are suspicious or just not interested.

As recently as 15 years ago, the huge Tower Records in downtown Tokyo didn’t take credit cards. Now they do but it’s still only 10% of transactions. Cash is still 90%.

If people do buy something online or mail-order, it’s usually paid for C.O.D. (“collect on delivery”) - paying cash to the UPS-style delivery person. You can also pay this person by credit card, and some do: the ones that distrust entering their card number online, but don’t mind handing it to a trusted person to process.

Another popular payment form is the convenience store! Every local 7-Eleven (and all similar) convenience store has a full range of services including having your packages delivered there to be paid for and picked up in person.

Bank-transfers are also common, but it costs about $4 per transfer, so is mainly used for bigger transactions, not buying a CD.

PayPal has an unfortunate history here. Seems a competing bank bought the exclusive license, but then ran it into the ground so it wouldn’t compete with their own similar system. Since it only works with other customers of that same bank, it never took off. So the whole PayPal phenomenon seems dead in the water, here.

Cell phones are massively popular. Everyone has one. In fact, because of the sophistication of the phones (sending text-messages is very popular), and the disinterest in the “web browser and a credit card” e-commerce of America, it seems the e-commerce style that will catch on in Japan will be direct-to-cellphone. I’m not talking about teens, by the way: I saw quite a few over-50 women and men spending their entire train ride typing on their cellphones, either sending messages or playing games. It is considered rude, though, to talk on your phone in indoor public places, and most people honor that. For as many people as have cellphones, I hardly ever saw someone talking on one in restaurants or trains.

It’s not so popular to have a home computer! Probably only half as many people have a computer as have a cellphone.

*** BUYING MUSIC ***

CDs cost $22 - $25 and are the main way to buy music in Japan.

There’s a regulated price-fixing for Japanese CDs: when a CD is created the price is actually named on the CD itself! The “2350 Yen” (or whatever) price is actually written right in that inner ring of data on every CD created in Japan. So if a CD is named as 2350 Yen, it is exactly 2350 Yen at Tower, 2350 Yen at HMV, and 2350 Yen at any little store on the corner. This price-fixing is seen as good for small shops, so they can fairly compete with the big giant shops.

Imported items, though, are free to be priced at whatever the seller wants. They usually sell for the same $22-$25 range, though.

People aren’t as price-sensitive, and there’s even a feeling that prices too low are just stupid & pointless, since they’re OK with paying more. Might even be a cultural bias so that if a CD is selling for $10-$15 it’s assumed there’s something wrong with it.

Apple iTunes hasn’t launched in Japan yet, though people say they’re planning to someday soon. The e-commerce differences may make it a tough sell, though.

There’s only one iTunes-style download music store, but it’s created by the majors and only puts its top hits up for download. Some think they do it this way because, though they know the digital-download future is inevitable, they’re trying to avoid it. Since people even spend $10 on a CD-single in Japan, the idea of the $1 single is threatening.

Online streaming of music, even 30-second previews, is basically not allowed in Japan. Only the label can license it, and labels don’t license it.

95% of the top-40 albums are Japanese: only a few like Mariah Carey, J-Lo, Avril Lavigne make the top charts. Even U2 and Jack Johnson don’t.

The two biggest threats to the music industry are free downloading (like Kazaa), and CD-renting(!). You can *rent* a CD, take it home, copy it, and return it. CD-rental shops are very common (and legal).

By far, the most popular music to import from America are black artists. Old classic R&B Soul, gospel, modern rap, as long as it’s black. There’s a real fascination with black Americans. Because of this, the artist needs to be on the cover, or it won’t sell. (And as you’ll see below, it has to be a real professionally manufactured CD, not a CD-R.)

*** MUSICIAN SCENE ***

This was a harder one to figure out, since most of my meetings and introductions were with top CD Baby customers, who are importers more than exporters. They know a lot about bringing in American music to Japan, but not as much about Japanese musicians. (Often none at all!) I did meet with a few people on the music-making side, though. (Producer, promoter, musician.)

A strange thing came up many times, when I was asking about independent musicians in Japan: this feeling that anyone not on a record label is a bad amateur. Some said that the labels are so actively looking for anyone good, and signing them to small development deals, that it’s felt that as a musician in Japan, if you’re not on any record label you must not be any good.

There are quite a few distributors that will carry “almost any” album that anyone puts out. But most just list it in their catalog, and don’t actually put it in stores. Some stores (even one big chain: DiskUnion) will carry independent albums, even working directly with musicians, on consignment.

One person told me that this is how record deals work in Japan:
- label likes a band and signs them
- label assigns them a management company
- label pays a *salary* (!!) to the band, and to the management company
- management company gets a much larger percentage than the artist
- artists get almost nothing for music sold or shows
… therefore: a record deal is actually a very good deal for a band that does NOT sell, because you can end up with a few years’ salary, even without selling anything. But if you sell a lot, it’s a pretty bad deal, since your success is not rewarded as directly.

I didn’t hear any of the venomous hate of record labels, though, like we do in America.

A couple bands, recently, sold 100,000 CDs in Japan as an indie. This has excited and encouraged people (like Ani DiFranco in America).

When I asked about where the musicians gather, (popular musician magazines, websites, email lists, directories), they were stumped. Though there’s one annual directory (a la Indie Bible or Yellow Pages of Rock) - nothing else came up. I really kept pressing for this, and still got nothing. Maybe there aren’t any, or I might have just asked the wrong people.

The only thing that did come up when I was asking about the “musician scene” was one small neighborhood in Tokyo where all the little “indie rock” record stores are. They say all the musicians hang out there, too. Might just be a rocker-thing, though. (Like Sunset Strip was in the 80’s.)

*** CDs ***

The topic of CD-Rs kept coming up in all my conversations. Apparently it’s a real issue, in Japan.

CD-Rs are seen as *dangerous* - that they might BREAK your CD player!! I thought this must be total misunderstanding or myth, but found that there’s some truth to it: that the oldest, earliest CD players, when trying to play CD-Rs, would malfunction, and sometimes never work again. Very strange.

Because of this, though, CD players are now marketed with “CD-R compatible” and people are actually aware of this.

This came up a lot, because when talking to the importers/resellers, they were SO upset whenever they buy something from CD Baby and discover it’s a CD-R. It means they can’t sell it.

All this being said, someone said he suspected the whole CD-R fear was a conspiracy manufactured by the labels who felt that a CD-R revolution would damage their $25-per-CD business.

CD-Rs are given away by musicians for free at their shows. Only printed and manufactured audio CDs are seen as “real”.

It’s just as cheap to manufacture 500 CDs as 1000 CDs. Most do even less, without penalty. In fact once you do press up 500-1000 CDs with a manufacturer, you’re free to order re-runs as small as even 100 copies, for the exact same price-per-disc as you paid for the initial 1000-CD run.

*** - ***

In America, the music scene is all a-buzz about Apple iTunes, Rhapsody/Napster subscriptions, and the digital future of music. Because it’s so covered in the press, I know many people think there’s NO scene for CDs anymore, that it’s ALL digital.

It really surprised me that Japan, which most people consider to be the most technologically advanced nation, has a music scene that is almost entirely based around paying $25 per CD in cash to physical stores, and that the idea of buying music online has been decidedly shunned.

Does the music scene in Japan need breaking and replacing? Even to ask feels arrogant.
Perhaps this music revolution that everyone is talking about only applies to America?
Maybe Japan will skip this whole transition stage we’re in and leapfrog into something much more advanced?

I didn’t go there to make any decisions. Just to listen, look, and learn. So I won’t end this with any conclusion.

Andy Oram

Related link: http://www.onpointradio.org/shows/2005/04/20050426_a_main.asp

The NPR radio show On Point had an in-depth discussion this evening of the lagging adoption of broadband in the U.S., which is certainly increasing but not at a rate matching advanced Asian economies. One caller raised a formalistic and rigid version of standard free-market economic arguments: if there is slow growth in broadband, it must be because there aren’t that many people who want it. Where, he asked, is the demand?

In a situation like this, where oligopolies in the local loop use political and market muscle to hold back competition, one has to look for other signs of the need. For instance, the rural areas of this country are emptying out. Even many cities are doing poorly as population piles up in a few megalopolises, particularly along the coasts.

This has all kinds of negative social ramifications: a crisis in affordable housing, increasing ecological damage and traffic snarls, exposure to flooding, and so on.

Basically, people are leaving the rural areas and the middle of the country because they can’t get jobs. They also find themselves disadvantaged when it comes to educational opportunities and other amenities. High-speed Internet access, with opportunities for telecommuting, distance education, medical videoconferencing, and other modern applications, can help restore a healthy balance to the country.

In short, demand is masked by flight.

The show was quite valuable in its discussion of the suppression of competition in last-mile access. The baby Bells squashed the hundreds of small Internet providers that tried to get a foothold in local markets in the 1990s and then told the FCC (with the desired results) that competition would be aided by having less competition–that is, that the FCC should let the Bells and cable companies duke it out without harassment from small innovators.

Now, as mentioned on the radio show, the telecom companies and cable companies are using the same argument to hold back municipal networks: supposedly, holding back competition is good for competition. The irony is that municipalities step in to take on the big job of building out a network only when the private companies have stayed away. And a government-run fiber network can lay the groundwork for competition at higher layers.

Let’s have some real competition, and then the hidden demand will reveal itself.

Uche Ogbuji

Related link: http://xmldb-org.sourceforge.net/xupdate/

XUpdate is a product of the XML:DB group, but it is designed for use far beyond XML databases. It’s a lot like XSLT, being an XML-based host language for XPath expressions, but XUpdate instructions are tailored for update tasks. Doing the equivalent in XSLT is rather clumsy, which is to be expected since XUpdate is specialized for the purpose. I’m a big fan of XSLT, as my body of work amply proves, but I prefer XUpdate for what it does best.

XUpdate has always been a shoestring community standard, and has always been skeletal, and somewhat incomplete. Despite this fact, it has over a dozen implementations (half of which can be used independently of any XML databases). This relative success is because it is simple to understand and simple to implement. XUpdate is a poster child for worse-is-better.

There is a low-volume mailing list, which used to be riddled with spam, but has since been brought under control by Per Nyfelt. I think a lot of XUpdate discussion has been directed more towards implementation mailing lists rather than the general spec list. Certainly, we get a lot of questions about 4Suite’s XUpdate implementation. There is also a very useful use cases document by Kimbro Staken. The main problem with this document is that it includes some controversial use cases that are not included in the actual spec. This is one of the things that will, I hope, be cleared up soon. I also hope the many links to outdated XUpdate spec locations will be fixed.

As a measure of XUpdate’s continuing relevance, and as a service to interested users, I’ve compiled a list of projects that include separate XUpdate implementations.

  • XML:DB XUpdate–reference XUpdate implementation in Java
  • 4Suite–XUpdate available via Python API, command line or XML repository
  • xmldiff–Python tool to generate XUpdate “diffs” between XML files
  • RxUpdate–An “enhanced” XUpdate implementation in Python, including RDF support
  • Apache Xindice–XML DBMS in Java. “At the present time Xindice uses XPath for its query language and XML:DB XUpdate for its update language.”
  • eXist–XML DBMS in Java.
  • Ozone–XML DBMS in Java.
  • X-Hive/DB–XML DBMS in Java. See the XUpdate page
  • dbXML–XML DBMS in Java. “XUpdate is also a transformation with some of the same goals as XSLT, but its syntax is simpler, and its purpose is to modify the content of documents in place.”
  • Orbeon PresentationServer–full-blown XML platform thingy, in Java. See docs on the XUpdate engine and lower-level processor. The latter link includes a useful intro to XUpdate.
  • Jaxup–“A Java XML Update engine”
  • Mobius Mako Command Line Utilities–Mobius is a Grid technology project. Mako is “a service that exposes and abstracts data resources as XML”, supporting XUpdate.
  • Montag–“a Java Web Services based system for the interaction with every Native XML Database that supplies a Java implementation of the XML:DB API.” Includes XUpdate support.
  • XML-XUpdate-LibXML–Perl implementation

If you know of any I haven’t mentioned, please point them out in comments. Overall, I hope you at least have a look at XUpdate for relevant tasks, and better yet, become involved. It has about as low a barrier to entry as one could possibly imagine.

Do you use XUpdate? Or do you really not accept its relevance?

Harold Davis

Related link: http://services.google.com/ads_inquiry/sitetarget?hl=en

In a further departure from its roots in searching, Google has announced a new program that will allow advertisers to choose the sites on which their ads appear.

I’ve written in the past about Google’s transformation (at least looking at revenue) from a search company to an advertising broker. But contextual advertising - Google’s other-than-search bread-and-butter - still involves technology that automatically calculates relevancy, just like a searching algorithm, and produces a marketplace for words. Whether the context is evaluated correctly or not by the automated mechanism is another story (analogous to questions of how well searching works).

In the new Google order of things, advertisers interested in branding can pick their sites without regard for contextual relevancy. The New York Times bills the changes as a move away from search for Google, and Google-commentator Brad Hill in his blog calls the move “industry shaking.”

Advertisers will pay for the new-style ads on a CPM basis, or per ad impression (not per ad click as with contextual ads), although the process of purchasing these ads will be blended with the traditional Google CPC (pay per click) word auction process.

These ads are intended to appeal to big advertisers who are looking for general branding (for example, all kinds of advertisers of luxury goods would probably like to appear on BMW’s site, even if the ads were not contextually relevant to cars).

Context-free ads may also work for advertisers who are better able to determine relevance than the automated algorithms - it makes sense to put ads for cheese on an oenophile site, but AdSense probably doesn’t think so. Google’s revenue stream will be a winner, as will big advertisers and owners of desirable Web content. Possible losers: anybody but Google in the business of brokering ads.

Ming Chow

Related link: http://www.cs.tufts.edu/~mchow/excollege

Hard to believe, but I am almost finished teaching a full one-semester college course: my course at Tufts University entitled “Security, Privacy, and Politics in the Computer Age,” offered by the Experimental College. It has certainly been an exhilarating few months, and a very rewarding, memorable, and flattering experience.

So what did I learn from teaching computer security, politics, and privacy to a group of twenty, mainly non-technical, college students? Here are some of my thoughts in a nutshell:

  • It is difficult to balance technical and non-technical information. Many students know what spyware and computer viruses are, but the technical workings of them are complicated. If you delve into complexities such as the operating system or the kernel, the students will be lost. I also recall making my cryptography lecture too simplistic, and I saw many students fall asleep.
  • Students are dependent on reactive tools including firewalls and anti-virus software. Such tools have been well-marketed, but they can only do so much. That is, the “bigger point” is missed: numerous security holes in software are unpublicized, which leads to one massive hole. The message that I sent to the class was clear: the first line of defense is to protect yourself and your systems (be as proactive as possible). Funny, I still receive assignments that mention relying on firewalls and anti-virus software to protect their systems.
  • Few have knowledge about open source software, and alternatives to popular software packages. It is important to discuss the software life-cycle development process early in the semester because it will give students insight into where a lot of the problems come from. One of the first comments from students that struck me was that many have never heard of open source software, nor of alternatives to popular software packages, such as GIMP, GAIM, and yes, even Firefox. As much as the technical community reads and speaks about OSS, the general public still doesn’t understand it.
  • Few have used Unix or Linux. Unix and Linux are sometimes dubbed “the most important operating systems you may never use,” and I found this quite true. That is why I distributed free copies of Knoppix to students, and used it for my lectures on occasion.
  • News and information evolve and change frequently. Several weeks after I gave a demonstration on password cracking, the news broke that Paris Hilton’s Sidekick had been cracked via a simple password. We had to reflect back on our previous lecture. Same issue with the recent slew of consumer database breaches. The instructor (myself) has to keep up with current events, especially when teaching such a course.
  • Students enjoy examples. Students love screenshots and hands-on examples from the terminal.
  • Instructor has to encourage feedback and dialog. Maybe it is because of the college environment; most of us have been there, done that. I found that students walk into class with very little expectation or motivation each day. They just want to go to class and leave, and probably forget the information. It is the instructor’s job to incorporate debate and dialog in the course. You just can’t hope that all students will be active. I had two debates and two expert panel sessions in the class, and they have been most engaging (as said by the students). Same goes for the discussions on copyrights, electronic voting, and P2P technologies–no surprise considering the topics are controversial and debatable.
  • Need a hands-on assignment to show how hard security is. Security is hard, we know that. But talk can only do so much. Recently, I gave a two-part group project on designing a fictitious state lottery game and its secure system. Not only did the students find that designing a system is difficult and time-consuming, but they also found how hard it is to account for everything. I had to use so much red ink on grading the design projects, both phases (the game design and the system design).

These are just some highlights of what I learned in my very first teaching experience. After I submit the course grades, I will sit down and collect all my thoughts about the course. Would I want to do this again? Absolutely, in a heartbeat.

Schuyler Erle

Related link: http://www.spacedaily.com/news/spacetravel-05v.html

This amazing, detailed article on Space Daily shows how NASA’s policy, before and after the manned lunar landings, has gone from brilliant to bumbling. The message is that NASA critically needs to re-engage the mission-oriented planning and direction of the Apollo era, if humanity is ever going to get back to the Moon, and then to Mars, and then back, in one piece. (Thanks, Anselm!)

Harold Davis

Statistically Improbable Phrases (a/k/a “SIPs”) is Amazon.com’s improbable name for a search ranking technique. Here’s Amazon’s explanation.

In more-or-less plain English, here’s how this works. Amazon indexes the “Search Inside” content of the books in its catalog (that is, the books for which publishers provide this content). In many cases, Amazon provides a list of SIPs on the main listing page for the title. For example, Starting an Online Business for Dummies by Greg Holden has a number of linked SIPs listed, including “your online business.” These SIPs are phrases that appear with anomalous frequency in the inside content of the cataloged book compared with the rate of occurrence of the SIP in the universe of books in general. This statistical over-occurrence implies that the SIP is a significant representation of the content of the book.
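
To make the “anomalous frequency” idea concrete, here is a toy shell sketch. Amazon’s actual formula is not described here, so this is only one plausible reading: it divides a phrase’s rate of occurrence in one book by its rate across all books. The function name sip_score and every count below are invented for illustration.

```bash
#!/usr/bin/env bash
# Toy sketch, not Amazon's method: score = (rate in this book) / (rate in all books).
# sip_score is a hypothetical helper; all counts are made up.
sip_score() {
    # $1 = occurrences of the phrase in the book
    # $2 = total phrases in the book
    # $3 = occurrences of the phrase in the whole catalog
    # $4 = total phrases in the whole catalog
    awk -v b="$1" -v bt="$2" -v c="$3" -v ct="$4" \
        'BEGIN { printf "%.1f\n", (b / bt) / (c / ct) }'
}

# A phrase used 40 times in a 100,000-phrase book, but only 200 times
# in a 100,000,000-phrase catalog, is 200 times more common in the book.
sip_score 40 100000 200 100000000
```

A score near 1 would mean the phrase is no more common in the book than anywhere else; the higher the score, the more “improbable” the phrase.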

By clicking one of the SIP links, you get other books in which the SIP occurs, sorted from most to least by the number of SIP references. For example, “Web Analytics” and “E-Commerce for Dummies” have the next highest occurrences of the SIP “your online business” after “Starting an Online Business for Dummies.”

This is a different and somewhat appealing way to use Amazon’s search facilities to find books in which the author uses distinctive phrases. Longer run, the concept has an elegant simplicity (as did the original PageRank algorithm), and may be useful for automated tagging and ranking of content.

Click here for a lively discussion of SIPs in the context of author as phrase maker, and here’s a fun discussion and list of adult SIPs on Amazon (over 18 only please click this link).

Tony Stubblebine

Every week the developers of the O’Reilly Network (that’s me, three
developers, and two admins) have a status meeting to check in with our key
managers, decide or rearrange priorities, and work through problems. This is a
dream meeting for managers: questions are answered and plans are laid.

It’s fair to say that status meetings aren’t a developer’s dream. After
several years of weekly meetings, ours were feeling stale. So we agreed to end
the meetings with a round of tips and tricks. First up, bash tricks.

At the end of our meeting, the managers bailed and we stuck around to geek out
over the tricks that make our work easier. Everyone had something to
contribute. I can’t recommend this enough! I’ve learned a lot from books like
Unix Power Tools. But by
sharing directly with your coworkers you get advice that’s targeted directly
to the work you do.

Here were our gems, the most useful tips that weren’t already common
knowledge among the developers.

pushd/popd

Bash will keep a history of the directories you visit; you just have to ask.
Bash stores the history in a stack and uses the commands pushd
and popd to manage the stack.

pushd dir - moves the current directory onto the stack and changes to
the dir directory.

popd - pops the top directory off of the stack and moves you into it.

We’re opening files all over the file system: internal code, vendor code, templates, configuration files, logs. Because of this we like the ability to take a detour on the file system and still navigate back to our working directory of the day. I think these commands are so useful that I alias’d them in my .bashrc:


alias cd="pushd"
alias bd="popd"

Now the cd command manages the stack for me as well as changing directories. Aliasing popd to bd gives me an easy-to-remember, easy-to-type way to move back up the stack: think “change dir” and “back dir”.
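
Here is a small sketch of the detour-and-return pattern, assuming bash. It calls pushd and popd directly, because aliases are not expanded in scripts, and the scratch directories are invented so the example touches nothing real.

```shell
# Assumes bash. Work in a throwaway area created just for this example.
scratch=$(mktemp -d)
mkdir -p "$scratch/code" "$scratch/logs"

cd "$scratch/code"                  # the "working directory of the day"
pushd "$scratch/logs" > /dev/null   # detour: remember code/, jump to logs/
detour=$PWD
popd > /dev/null                    # pop the stack and land back in code/
back=$PWD

echo "detour: $detour"
echo "back:   $back"
```

One thing to watch for with the cd alias: pushd prints the stack every time it runs, and pushd with no arguments swaps the top two directories instead of taking you home the way a bare cd does.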

History

Bash keeps a history of the commands you’ve run. My group was already comfortable with the up and down arrows to navigate the history, !! to repeat the last command, and !foo to repeat the last command starting with foo.

Our newest admin had a better way: CTRL-R, which searches backward through your history as you type. Pressing CTRL-R repeatedly steps back through earlier matching commands.

Home/End

CTRL-A takes you to the beginning of the line and CTRL-E takes you to the end of the line. This is probably basic shell knowledge, but I’m probably (hopefully) not the only person who didn’t know it.

For Loops

We’ve got a cluster of machines that we’ll sometimes need to loop through. Here’s an example from our admins that checks uptime across our cluster.

$ for s in `cat server.list`; do ssh $s uptime; done;
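
If the server list grows comments or blank lines, a while-read loop is a bit more forgiving. This is only a sketch: the function name run_on_hosts is hypothetical, and the second argument (the program that reaches each host) is a parameter so you can try the loop with echo before pointing it at ssh.

```shell
# run_on_hosts LIST RUNNER [ARGS...]
# Hypothetical helper: for every host named in LIST, run
#   RUNNER host ARGS...
# skipping blank lines and lines that start with '#'.
run_on_hosts() {
    local list=$1 runner=$2
    shift 2
    local host
    while IFS= read -r host; do
        case $host in ''|'#'*) continue ;; esac
        printf '== %s ==\n' "$host"
        # stdin comes from /dev/null so the runner (ssh in particular)
        # cannot swallow the rest of the host list
        "$runner" "$host" "$@" < /dev/null
    done < "$list"
}

# Real use would look like:  run_on_hosts server.list ssh uptime
# Dry run with echo:         run_on_hosts server.list echo uptime
```

The redirect from /dev/null matters because ssh reads standard input by default, which would otherwise eat the remaining lines of the list after the first host.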

Working with the Previous Command

Sometimes you want to run several commands on the same file, such as running ls before deciding whether that’s the file you want to edit.


ls -l /long/path/to/file.txt
vi /long/path/to/file.txt

Bash provides a shortcut (!$) that holds the last word from the previous command. So in the above you could just write vi !$.

If your last command had a typo you can fix the command and rerun it with this construct, ^foo^bar. That replaces the first occurrence of foo in your command with bar.
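
History shortcuts like !$ are normally only active at an interactive prompt, so they are hard to show in a script. Bash does have a close relative that works in both places: the special parameter $_, which expands to the last argument of the previous command. A minimal sketch, assuming bash, with a throwaway temp file standing in for /long/path/to/file.txt:

```shell
# Assumes bash. A temp file stands in for the long path.
target=$(mktemp)

ls -l "$target" > /dev/null   # first command: look at the file
last_arg=$_                   # $_ is the last argument of the previous command

echo "would edit: $last_arg"  # at a prompt you would type: vi !$
rm -f "$last_arg"
```

At the prompt, !$ is usually the handier of the two because bash echoes the expanded command before running it.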

Bonus Tip: use Awk

Our admins seem to think awk is pretty useful. And my boss thought it was
so useful, he wrote a book
on it. I can’t keep any of the awk syntax in my head beyond printing out a
column from a file.

The normal column delimiter is whitespace. So if you wanted to print out the seventh column in an Apache access log (that’s the request url in my logs) you could write:

cat access_log | awk '{print $7}'

You can change the delimiter with -F. So if you wanted to list all the users on your system, you could pull them out of /etc/passwd with:

$ cat /etc/passwd | awk -F: '{print $1}'

The /etc/passwd delimiter is :, which I’ve indicated
to awk with -F:.
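
Going one small step past printing a column, awk can also count what it sees. The sketch below wraps both one-liners in functions; the names top_urls and list_users are made up for this example, and it assumes the request URL is the seventh field, as it is in my logs.

```shell
# top_urls LOG: count requests per URL (field 7), busiest first.
top_urls() {
    awk '{ hits[$7]++ } END { for (url in hits) print hits[url], url }' "$1" |
        sort -k1,1nr -k2,2
}

# list_users FILE: print the first colon-separated field of each line.
list_users() {
    awk -F: '{ print $1 }' "$1"
}

# Usage:
#   top_urls access_log
#   list_users /etc/passwd
```

Passing the file name to awk directly also saves the extra cat process.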

We’re doing vim tricks next.

What’s your favorite bash trick? How else do you share tricks among developers?

Harold Davis

Related link: http://www.google.com/searchhistory/

Google has a new feature that tracks your search history. (Click the link to open the sign-up page for the application, which otherwise can be accessed through Google Labs.) This is another one of Google’s wonderful tools that is a “beta” that is not really a beta.

So far, the functionality is pretty straightforward and (at least for me) very useful. When you are logged in, and you can log in of course from any computer, Google keeps track of your searches. You can click on any of the links that represent a saved search to see the full text of a search. You can also retrieve searches by date using the calendar that the Search History Tool provides.

Once you sign up for the Search History Tool, your Google home page changes. Up on the right-hand top, you’ll see your sign-in email, a link that takes you to your account history (which is where to find the calendar and search links, and also the ability to remove any or all search items), a link that takes you to your Google account settings, and a link to sign out. If you do sign out, Google’s home page will show you a sign-in link.

Keeping track of my search history is a very useful feature for me. I can’t tell you how many Google searches I do a day (probably in the three or four digits), although the Search History Tool will in fact tell me this. Many times, I’ve “lost” information from a search that I thought I didn’t need (but actually did!). The Search History Tool will pretty much solve this problem for me, I think.

Down the road, the Search History Tool will probably let Google refine searches for me based on my search history (it remains to be seen how helpful this is).

The Search History Tool may allow customization that is an important weapon in the battle against search spam, because I may be able to “train” my future searches by deploying a “Junk” setting against my Search History results. Other forms of search customization, once I’m logged in to search, are also possible of course.

I also see the Search History Tool as a Trojan horse for the introduction of more Yahoo-like services. Google needs to know its users better to create these services: and what better way to know someone than to keep track of their searches?

Andy Oram

Related link: http://www.mysqluc.com

On the last day of the 2005 MySQL conference,
I finally heard a speaker who stretched the audience’s assumptions and
pointed toward a liberating path forward. This is the sign of a good
conference, incidentally–most of the sessions deal intensively with
the problems of today, but one or two keynotes prepare the listeners
for tomorrow.

I wrote in my earlier weblog about this conference
that MySQL was becoming conventional. Many people are doing innovative
things with it–I sat in today, for instance, on a session about MySQL
as an embedded server or library–but the largest attendance has been
reserved for traditional topics such as replication and performance
tuning. MySQL AB itself is concerned with catching up to its
competitors in terms of SQL features that centralize more and more
control in the database engine.

Adam Bosworth, in his keynote today, threw all that out and set his
ship headed in a different direction. The problem he found with
centralizing processing–with stored procedures and triggers and so
forth–is that it doesn’t scale. His talk also implied that it
restricts users from making innovative connections. Google, the most
recent landing place in Bosworth’s long and impressive career,
illustrates an entirely different way to handle data.

Adam Bosworth’s view of an open data query protocol

The promise of the Web was to aggregate the contributions of
individuals everywhere and make retrieval easy along any lines one
chose to use. As the volume of content became unmanageable, XQuery was
supposed to provide a Web-aware search mechanism, and Web Services the
infrastructure and protocols to connect sites. XQuery and Web Services
were too big and came too late, however. Nobody actually wants to use
them, even if they know how.

So the gap has been filled with RSS, the model highlighted by Bosworth
for the next stage in search. RSS and Atom are lightweight and easy to
understand. They put control in the hands of the content providers and
the potential viewers.

Bosworth’s extended vision is for a protocol that provides raw access
to data, somewhat as XQuery is supposed to do. It would be a very
simple and database-independent protocol that would make all data in
the world open. Then, he says, everybody could do what Google
does. And more–we could provide distributed updates too.

Where to impose structure

The Google approach to data, carried through in Bosworth’s vision,
runs head-on up against the ideals of the relational database model.
The entire relational approach, from the canon of Third Normal Form
(three is a holy number) to the enormously complex collection of
analytic functions, subqueries, and other ways to impose structure in
SQL, is an attempt to be as precise as possible about the data chosen
and returned.

Bosworth isn’t interested in that. If the user gets a few hundred
results and has to scroll through them a little bit, that’s fine. We
don’t need no stinkin’ metadata or knowledge management.

The philosophical debate underlying relational database design

Bosworth evoked earlier debates that I’ve found valuable and aired
several concerns of mine; his views of the XML specs and RSS/Atom are
familiar. But his brief critique of the trend toward putting more and
more features into the database engine–a critique that he whisked
through on the way to grander visions–left open a question about the
basic philosophy of SQL.

When MySQL was bare-bones and lightweight (which it still is compared
to commercial database management systems or PostgreSQL), it put
responsibility in the hands of the application programmer. If a value
was supposed to be limited to a particular range or two columns were
supposed to be entered in tandem, it was the application programmer
that made sure of it.

In contrast, traditional database design takes as much control away
from the application as possible and puts it in the database. A
constraint or trigger or stored procedure or foreign key can make sure
that no one gives someone an absurdly high salary or fires an employee
while leaving his phone number in the database.

This centralized control is a relic of the 1970s, when corporate staff
would sit at command-line processors and type in SQL to do what they
wanted. Nowadays, when an application and even a Web interface stand
between the user and the database engine, the never-trust-the-user
philosophy is less valid. At the very least, an application has to
know the rules the database is enforcing and translate error messages
into something the user can understand. The wall between application
and database engine is porous, so the application can take on more of
the validation and logic.

But both philosophies are valid, and now MySQL offers a choice. I
suggested to Arjen Lentz, the organizer of this year’s conference,
that he offer a debate next year between the application-aware
philosophy and the database-aware philosophy–when is each
appropriate?

Most of us still need to find that phone number for an employee and do
other everyday tasks; we’ll be using a relational database for that,
and MySQL will be providing that service for more and more sites. The
people with day jobs who came this year to find out whether MySQL
could bring home the bacon got their answers. But MySQL can also
support fun applications, and I hope to see more coolness next year.

brian d foy

I went to a taping of The Daily Show yesterday. In the small studio, I saw a blueberry iMac at the sound station, an orange iMac behind the scenes (literally!), and what I think was the top of a G4 Tower.

I saw a bunch of other monitors hooked up to various hardware things, but I couldn’t tell what they were running.

No big whoop.

Jono Bacon

In recent years, the desktop, be it commercial or not, has evolved into a network-aware base for running applications. Although the desktop has been through an extensive round of spit-sheening, largely driven by orange-sunglasses wearing usability engineers, the desktop is still largely un-integrated. Sure, you can take content from one program and embed it in another program, a kind of glorified cut and paste, but the applications still don’t integrate in ways that really benefit the user.

To understand how to design for proper integration, you need to first explore what people actually use their computers for. Aside from recreational use, the majority of business users, and those who actually work on their computers, all utilise them within the context of a project. Within this context, you find users who mentally hook together different applications with the intention of satisfying criteria to achieve a project or goal. This can be demonstrated with a simple use case.

Imagine that John Smith works as a consultant. John needs to interact with a variety of software tools:

  • Customer Relationship Management (CRM) system
  • Email client
  • Private and public shared calendar
  • PDA that is synced to the calendar
  • Website
  • Office suite

This is the way that John will typically bring these tools together when working on a project:

  1. First John speaks to the client on the phone and then manually logs the client and call in the CRM and logs a meeting on his PDA. He then syncs his PDA to his calendar in Evolution. He also adds the client contact details to the PDA and syncs those with the main address database in the office.
  2. Next, John needs to fill in some forms for the paperwork. He gets the CRM details for the client and creates a document in OpenOffice.org. He meticulously copies the data over that he stored in the CRM and prints it out.
  3. To prepare for the meeting, John wants to meet with his colleagues to discuss some aspects of the job. He discusses some things with one colleague over the water dispenser, but another is working from home. He cannot see him in Gaim, so he sends him an email and they have a lengthy (in time) email conversation.
  4. John now cuts and pastes the discussion from the emails into the CRM to better prepare for the meeting.
  5. John tries to contact the client to confirm the meeting but cannot get through. The client calls up on John’s office VoIP phone and the phone emails him the answering machine message. John listens to the message and copies the details into the CRM manually.
  6. The meeting happens and John makes notes on his PDA which are then copied manually into the CRM as well as some other emails with the client.

This use case is a typical example of a number of different systems working together on the same project but not actually integrating. The result of this lack of integration is that there is a lot of manual copying and pasting between different systems, particularly into the CRM, a system that most people find difficult to keep updated.

In reality, the above use case is not actually realistic. In our busy working lives, it is difficult and practically impossible to make use of all of these systems and keep them up to date manually. Each of these systems relies on John remembering to update them with the right information, and as anyone who works in a busy office will know, it is far too easy to get sidetracked by fellow staff members, other projects, websites, other emails, other IM conversations and, let’s not forget, the surprise visitors who want to bend your ear for an hour or so.

There is clearly a problem here. Most of us will use a variety of software tools in one way or another to achieve a combined goal, but these tools cannot talk to each other effectively - they cannot integrate, and this wastes both time and effort. The source of the problem is that the desktop integrates together at a software level, but not a task level. Integrating these applications does not just mean sharing data between them, but it also means pro-actively adjusting the user experience of these tools in favor of a project.

The solution

The solution to many of these problems is to adjust the desktop experience so that it supports Projects at the base level. This would involve creating a simple to use Project Manager tool and including support in a range of other applications to automatically update and integrate with this tool. You can think of the Project Manager tool as a means to simply help different tools to talk to each other and to provide a central place in which the Project is managed. This Project Manager tool would also provide a clear summary of the project, show who is working on what and provide access to all files relevant to the project, all of this being useful for reporting and auditing procedures.

Imagine this case study for John’s situation:

  1. John speaks to the client on the phone about the work that the client wants doing.
  2. John now logs into the Project tool on his computer and creates a new project. He adds some details about the project, and in the Meeting box he selects the date for the first meeting. When he saves the project, the software will update Evolution with the meeting, which in turn syncs with his PDA. The information about the project is also added to the CRM and the call and meeting are added as CRM activities. The software also generates an OpenOffice.org document with the relevant information added and adds the document to the project file store.
  3. John now decides that he is going to work on the project. In the top right hand side of his GNOME desktop he clicks on the project icon and a drop down list of his current projects is displayed. He selects the right project and now his desktop has been adjusted to reflect that project - the right bookmarks are loaded in Firefox, the right contacts (and only the right contacts) appear in Gaim, emails from the right people appear in Evolution and Nautilus is familiar with the files involved in the project, accessible from the My Project Files icon on the desktop.
  4. John clicks on My Project Files and clicks on the OpenOffice.org document to update some of the details. When he saves the file, this action is stored in the CRM and the updates are possibly mailed to his colleagues automatically with a list of the changes.
  5. John checks his mail in Evolution and sees that he has received an email from the client. When the mail was received by Evolution, it automatically noted it in the CRM and a message pops up hovering over the project icon informing him that a new project related email has been received.
  6. John decides to send a new email and selects the people involved in the project from his now more limited address book (the address book is restricted when he selects the project, so that he can find contacts quicker).
  7. As John is working on some parts of the project, he sees interesting web pages in Firefox that he bookmarks in the project. As he works, the client pops up in Gaim. Gaim has automatically adjusted itself to only include buddies within the current project so John does not get distracted and ignore Gaim totally. He has the conversation and then the log of the conversation is added to the CRM automatically.
  8. John goes to the meeting, makes notes on his PDA, then gets back and syncs the PDA. The notes are automatically added to the CRM and if a second meeting was arranged, Evolution would be updated with the new time and the CRM would be notified of the new meeting as well as any colleagues that are required (Evolution would mail them).
  9. At the end of the week John is working on his timesheets and most of them have been automatically filled in by the Project Manager tool with detailed times of when he was working on all these parts of the project. He only needs to fill in the gaps.
  10. As John finishes the project, he is asked by his boss for a report. He clicks a single button in the Project Manager and a PDF report is generated automatically for him with a detailed breakdown of what he has worked on, how much staff labor was involved, the labor costs (calculated by referencing the time spent on the project and John’s average hourly wage), the goals achieved, the steps involved and more. A few other colleagues have asked to see the report, so John can add them to the report mail-out or generate an online report that is automatically uploaded and then announced by mail to interested parties.

With this use case, you can see how much of the leg work is automated by the systems. This case specifically improves on the old one in that the logging of each event in the CRM is automated by the software and John does not need to remember to log in and make the updates himself. In addition to this, each application within his desktop is adjusted to reflect the resources that are part of the current project. This is particularly useful when it comes to communication.

Better communication with the integrated desktop

One of the problems with communications tools is that they are notorious for sidetracking you. Possibly the largest offender is an IM client such as Gaim. When you log on to IM, it is likely that you are looking to speak to someone in particular, or you may be specifically interested in speaking to a particular group of people. As an example, I have a number of friends who work in IT and a number of friends who are in bands. When I am mentally in work mode and I am working on a project, I often log onto IM to ask a particular question to an IT buddy. Typically when I log on, one of the music buddies will pop up to chat to me and I feel guilty if I just ignore them. I have a short conversation and typically get sidetracked by the discussion. This not only wastes time, but it also affects my concentration. The result of this is that I tend to leave IM switched off unless I am specifically looking to be distracted or speak to non-work friends. It seems such a shame to waste an entire medium of communication just because of distractions from people other than the ones I am trying to reach. Oh, and switching your status to busy does not alleviate this problem…

In an ideal world, Gaim (or any other IM client) will check the contacts in the current project and only advertise my online status to them and their online status to me. This will restrict IM to a tool that is useful for the project in hand, and my contacts are likely to talk to me on matters that are relevant. Some of you may think, well just organise your buddies into groups and select the right group. The problem with this is the same old issue affecting the current desktop - I have to make the effort to adjust it. No. The tool should make the effort to adjust it. I see no reason why I should indicate to five or six tools that I am working on the same project. I should indicate my desire to work on the project once and then the Project Manager updates everything else.

Can it happen?

Now, all of this is being discussed in a perfect world where this can all be coded and works effectively. Can it actually happen? Yes, I do believe it can.

I am specifically interested in making all of this work in GNOME. There are a few reasons for this. Firstly, the GNOME project have a fairly strong integration and control over a number of different applications. I don’t mean this so much from a technical perspective, but more from a social perspective. If you read Planet GNOME or keep up with the GNOME websites, the GNOME developers seem far more in touch with each other and better integrated. This is largely due to the existence of a lot of GNOME hackers in companies such as Red Hat and Novell, but also the fact that the GNOME hackers seem to discuss and actually produce software efficiently in tight groups. Some may see this as an old-boys-club, but I think it is just good hackers working with other good hackers.

Another reason why this can happen in GNOME is because you can only make this work at an architectural level and not at an application level. If the Gaim developers buy into the idea but other tools don’t, nothing will happen. To really make this work, the foundation GNOME architecture will need to incorporate a means to talk to other project-aware applications in different ways. This will require data-aware widgets becoming project-aware, the GNOME file picker including the project in the GNOME VFS, nautilus remembering file location histories for the project, recent-files menus remembering recent project files, evolution-data-server supporting sharing of contacts all over the place and more.

A final reason why I think it can happen with GNOME is that I believe the GNOME developers are real innovators. Some of the work going on in GNOME has been quite groundbreaking with work such as Beagle, Project Utopia, Sabayon, Tomboy, custom widgets in F-Spot, file picker improvements and the GNOME VFS, luminosity eye-candy, support for SVG and more. In addition to this, the GNOME project have really hooked into some of the freedesktop.org technologies such as HAL, DBUS, XGL and more. Another point is that I was quite pleased to see the GNOME developers had the balls to bravely include the spatial nautilus in GNOME, a decision that caused much controversy but also spoke volumes in terms of dedication to usability.

I think that, reasonably, the technology is available to make much of this happen. A lot of the challenges can be solved with DBUS, but I also think that a deep integration in GNOME of many of these project driven principles can work. Even then, the concept of the project does not need to be the only limiting factor. In addition to a Project, you may want to incorporate other modes for integrating applications together. I am not sure what they are right now, but I am sure they can be discussed extensively on mailing lists in further detail.

I should clarify that I am neither a usability engineer nor a GNOME developer, but I have a strong interest in both usability and GNOME development. I am a consultant who works every day to help businesses, charities, schools and consumers get the most out of Open Source. The problems that I mention in this article about the lack of integration are real, tangible and measurable problems that I myself, my colleagues and my clients face every day. Duplication of effort typically results in no effort with some systems, and then this effort must all be reproduced in bulk when a project is tied up. With all the appropriate applications in place (email, IM, productivity, CRM etc.), there is a real potential to make them work together more effectively and provide a truly innovative reason to move to the Linux desktop. There are many other factors to discuss in the design of this integrated desktop, and I am certainly not the only person to discuss this idea, but hopefully this article can contribute to the design and the discussion.

Do you think it can work? How can it be improved? Write your views below…

Andy Lester

The new O’Reilly Radar site is pretty cool. Tim O’Reilly, Nat Torkington, Rael Dornfest and Marc Hedlund all post links and articles about cool stuff that’s on O’Reilly Media’s radar. There are charts and stats of what’s going on out in the forefront of tech.


It wasn’t until today that I got the play on words.


You have to type in radar.oreilly.com.


I’m surprised that they didn’t start off with a link to a Google Maps shot of Ottumwa, Iowa.

chromatic

Related link: http://www.onlamp.com/pub/a/onlamp/2005/03/31/extreme_admin.html

Andrew Cowie (who generously wrote the article linked above) expanded his theory of dealing with complexity in operations in a talk yesterday. Here are a few of his ideas that stuck in my head.

  • Programmers communicate through code. Operations people communicate through their procedures and documentation.

    If you take the idea that programming makes blueprints and compiling and deploying builds houses, the equivalent in the administrative world to programming is the design of systems and processes. (This paragraph is my idea, not necessarily Andrew’s.)

  • You can’t hire a colonel, you can only grow one. Andrew pointed out that members of the military spend most of their time learning. Is systems administration any different?
  • High turnover is detrimental to high trust. If your organization burns through people in months, how do you grow to trust your co-workers?
  • The people really doing the work are the best ones to determine the risks.

    This is another similarity to XP, which allows developers to estimate the amount of time that each task will take. It’s up to the customer to arrange the tasks in the most desirable order, but the customer cannot change the estimates and the developers cannot change the order of the tasks. (This requires trust which requires time.)

  • Observe, Reflect, Decide, Act, Learn. This is a pattern of behavior to use when encountering situations and making decisions.

That’s pretty theoretical, but there’s a lot to think about. Andrew also had lots of useful practical advice. One idea to consider is asking a friendly coworker in another department to sit in on an installation or change session. This has two advantages. First, if you grab someone from sales, marketing, or management (for example), you’ll show off the work you actually do in a more concrete way. Second, you can have him or her time and check off your procedures as you accomplish them, helping to verify your time estimates and keep you on track.

You can also send him or her out to buy doughnuts.

Want to hear more from Andrew on ONLamp.com?

Harold Davis

This is a tale of two software companies (or maybe three).

They were the best of companies, and they were the worst of companies. And they’ve yet to go to a far, far better place to peddle their software.

To get back to my story, corporations, like people, have a lifecycle with a beginning, middle, and end.

Once upon a time, a long time ago, in a land far far away, Big Blue - IBM - was the be-all and end-all of everything to do with the computer industry. Bureaucratic, ponderous, rich, and powerful, IBM watched the nimble Mister Softee - Microsoft - steal the software side of the computer business out from under it.

Mister Softee was everything Big Blue wasn’t: a teenage rebel, improvising like crazy, able to turn on a dime, handing out stock options like candy in a bowl near the cash register of a restaurant where the food isn’t so good.

Now Mister Softee has grown as soft as its nickname and frumpy, with middle-aged love handles to match. Stock options are long gone. Microsoft pays a dividend!

Microsoft wants to climb the enterprise.

This company is no longer nimble, and takes literally years to pass software through its bureaucratic process before release.

We didn’t love Mister Softee when he was young and agile, but he always impressed us with his vigor and chutzpah (though we always wished he built better, less buggy, software).

Now Mister Softee is as rich as Croesus and out-IBMs IBM. He’s dangerous! He’s fat! He’s rich! His speed of innovation is falling way behind compared to younger rivals like Google. More than ever, he’s fun to despise.

Maybe, just maybe, Mister Softee is also starting to become irrelevant (thanks to Open Source, Linux, the Web, and Google).

Harold Davis

Related link: http://www.braintique.com/research/mt-archives/000147.shtml

If you haven’t tried it, the relatively new mapping capabilities at Google are very cool. I like the maps. You enter an address (or portion of one). The user interface is very sparse, with a widget in the upper left to control zooming in and out and panning across a map. Like Mapquest, you can get driving directions to or from an address. Unlike Mapquest, there are no annoying ads, pop-ups, and other distractions. You can use the Google maps to find businesses or services of a specific type in a given locale.

For reasons I can’t quite put my finger on, I think the Mapquest maps may actually be a little better for navigating by car than the Google maps. But one feature of the Google mapping application is, in fact, cool beyond belief. If you click the Satellite button on the upper right hand corner of the screen, you can see the aerial, satellite photographic view of any map. The zooming and panning tools work with these satellite pictures.

Start with where you live from above and pinpoint your block and rooftop. You can zoom in and out, see your whole city or state. Kids love this.

Some fine print: Google maps and sat photos are limited to the United States and Canada (more world coverage is promised soon). Coverage in rural areas can be spotty. This, however, corresponds rather well to the areas that are not much sought after (click here for a Google engineer’s visualization of frequency of search by locale).

More fine print: the photos seem somewhat dated (for example, big elm trees can be seen in the aerial view of my house, though they came down in winter storms over two years ago). There are the usual sporadic reported glitches in the maps (this is not unique to Google’s maps).

Here’s a neat application that combines Google Maps and Craig’s List so you can view the location of Craig’s List real estate listings.

Harold Davis

Related link: http://www.braintique.com/research/mt-archives/000126.shtml

An extremely important part of Google’s business is famously built upon contextual advertising: Advertisers bid on keywords using Google’s AdWords software, and the winners have ads placed “contextually” on web sites whose publishers have elected to affiliate with Google using Google’s AdSense software.

But “contextually” is a significant misnomer. Computers are very good at literally matching keywords, but very bad at catching the subtle nuances of context.

As a Web publisher, you find offensive ads placed by Google. For example, on Phyllis’s HighRisk.org, a site devoted to helping parents with preemies and high-risk pregnancy conditions, we get ads for thinly disguised anti-abortionists. You can deal with this one by blocking the domains in question (Google allows AdSense publishers to block up to 200 domains as “competitors”).

It’s a little harder to deal with what turned up when I wrote a blog entry blasting intelligent design as a euphemism for creationism. Both the blog entry and my Main blog page for the month kept getting ads from anti-evolutionists too numerous to block by domain.

Similarly, but a little funnier, when I wrote a blog entry commenting on a business press item comparing Google to Wal-Mart, and coming down hard on Wal-Mart, and another item just blasting Wal-Mart, both my blog items and my monthly page started getting inundated with ads urging readers to shop Wal-Mart, presumably until they drop.

Further up the black humor scale, a blog entry comparing Terri Schiavo’s fate unfavorably with being buried brain dead and coated in honey in a red ant heap draws lots of AdSense ads for ant pest control services.

Obviously, these examples are not isolated to my Web content, and are replicated millions of times over across the Web. Obviously, some “contextual” ads do work: people do click on them and end up buying goods or services. (Advertisers can measure the success rates and are not fools.)

Still, the very term “contextual” gives one hope for better, more intelligent, placements that are truly context sensitive. And, as a publisher, these stupid ads make me feel like running out and telling the world: click those ads for creationism, Wal-Mart, and ant control and cost those foolish advertisers some bucks!

Harold Davis

Related link: http://www.braintique.com/research/mt-archives/000142.shtml

According to a recent article in SitePro News, an online publication aimed at Webmasters who want to optimize their sites, Google’s delay in ranking sites, and the delay in according credit to inbound links, is a feature, not a bug.

I’ve written critically about the longer and longer wait times for sites to get indexed as a problem (see Is Google Painting Itself into a Corner?). Now, according to Lawrence Deon, an SEO (Search Engine Optimization) expert and the author of the article titled “Surviving Google’s Aging Delay” in SitePro News, it turns out that Google does it on purpose as part of the arms race with those gaming the system. The “probationary period” makes sure that there are no instant returns for “manufacturing” tons of links, but it also makes it harder for newcomers to break into the system. In addition (this may or may not be an unintended side effect) it makes it more worthwhile than it used to be to purchase AdWords slots from Google to draw traffic, at least during the probationary period.

The idea that Google probably intends to slow indexing and ranking down, and that this is (in some ways) to Google’s financial benefit, makes me call again for publication of the details of the PageRank algorithm. (See Publish the PageRank Algorithm.) Let the antiseptic of open scrutiny and discussion work its magic on this matter that is so important to the Web!

(Adapted from an April 11 entry in the Googleplex Blog.)

Harold Davis

Related link: http://www.braintique.com/research/mt-archives/000130.shtml

Google Enterprise general manager David Girouard is quoted in a recent Information Week article as saying that Google’s PageRank algorithm uses more than 100 variables in its calculations.

Google’s PageRank algorithm is used for the all-important determination of how search results are ordered. In other words, the higher the PageRank, the more likely you are to find a page using Google. Most people display Google search results ten per page. Studies have shown that there is a huge difference in the number of click-throughs you get if your result is one of the first three top-ranked pages, and also that there is close to 100% fall-off in click-throughs after three pages (or thirty search results). This helps to spell out the importance of PageRank and its gate-keeping function towards the information available on the Web.

If it is true that more than 100 variables are used to calculate a given Web page’s PageRank, then PageRank has come a long way from the rather simple mechanism published by Brin and Page in their graduate student papers, and used by Google in the early days.

In the proto-PageRank system published by Brin and Page, a page’s PageRank is a fraction calculated recursively by summing the PageRanks of the pages that link to it, and applying a simple damping factor representing how likely it is for anyone to surf away from a given page. In this theoretical Web universe, the sum of all PageRanks is always 1. Here’s some material from Building Research Tools with Google for Dummies about how Google works.
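For the curious, here is a minimal Python sketch of that early, published idea. It is only an illustration, not Google's code: the three-page link graph is made up, the function name `pagerank` is my own, the damping value of 0.85 is the one usually quoted from the original paper, and the handling of pages with no outbound links (spreading their rank evenly) is an assumption. It uses the normalized form in which all ranks sum to 1.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively estimate PageRank for a small link graph.

    links maps each page to the list of pages it links to; every page
    must appear as a key, even if it links to nothing.
    """
    pages = sorted(links)
    n = len(pages)
    # Start with rank spread evenly, so the total is 1.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page passes its damped rank equally to the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Assumption: a page with no links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page Web: a links to b and c, b links to c, c links to a.
example_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
example_ranks = pagerank(example_links)
```

In this toy graph, page c collects rank from both a and b, so it ends up ranked above b, and the ranks still add up to 1.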

It’s amusing to note that the term “PageRank” was probably coined to reflect Larry Page’s role as the creator of the concept rather than because it is about ranking pages.

There is something deeply troubling about the complex and opaque nature of the 100+ variable unpublished PageRank algorithm as it stands today. In effect, this means that nobody (except Google insiders) understands how information in this most important of information portals passes the gate keepers.

It’s probably unreasonable to expect Google to publish how PageRank really works in light of competition from other search engines, and the efforts of SEO Webmasters to game the system. But not publishing the details of the PageRank algorithm goes against the tenets of open source espoused by many who work at Google, violates the idea that information should be freely available (after all, this is a most important piece of meta information!), and deprives Google of the open-source-like benefits of community scrutiny.

So I say, free the PageRank algorithm now!

(This post is adapted from a March 31, 2005 Googleplex Blog entry.)

chromatic

Related link: http://jan.kneschke.de/projects/mysql/#3

I’ve seen two other ways of storing hierarchical data structures in relational tables and retrieving them. One of them is using multiple queries to retrieve children, recursing in the application. Another way is to add metadata to each row to allow value comparisons when retrieving children.

The problem with the first approach is that it requires multiple trips to the database. The problem with the second approach is the bookkeeping required to modify the metadata when you add a child to the left of other children. (Think of inserting into a b-tree, for example. I suspect there’s an algorithmic improvement lurking somewhere in the choice of numbers used in the metadata so as to reduce the need to rebalance frequently.)
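To make the first approach concrete, here is a small sketch in Python using the standard `sqlite3` module rather than MySQL. The `node` table, its columns, and the helper names are all hypothetical; the point is only to show how recursing in the application issues one query per visited node.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table where each row points at its parent row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO node (id, parent_id, name) VALUES (?, ?, ?)",
        [
            (1, None, "root"),
            (2, 1, "a"),
            (3, 1, "b"),
            (4, 2, "a1"),
            (5, 2, "a2"),
        ],
    )
    return conn


def fetch_subtree(conn, node_id, counter, depth=0):
    """Return (depth, name) pairs for all descendants of node_id.

    counter["queries"] is incremented once per database round trip,
    which makes the cost of this approach visible.
    """
    counter["queries"] += 1
    rows = conn.execute(
        "SELECT id, name FROM node WHERE parent_id = ? ORDER BY id", (node_id,)
    ).fetchall()
    result = []
    for child_id, name in rows:
        result.append((depth, name))
        # Recurse in the application: another query for each child.
        result.extend(fetch_subtree(conn, child_id, counter, depth + 1))
    return result


demo_conn = build_demo_db()
demo_counter = {"queries": 0}
demo_tree = fetch_subtree(demo_conn, 1, demo_counter)
```

Walking this five-row tree takes five queries, one for every node visited, which is exactly the round-trip cost described above.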

The technique Jan demonstrated defines two stored procedures and calls one recursively to handle fetching children of the current node. That’s one SQL query, which is nice, and it avoids the need to insert and adjust metadata on table writes.

That seems like a nice solution to me.

What do you think?

Andy Oram

Related link: http://www.mysqluc.com/

MySQL is graying in several metaphorical ways. Of course, it is simply
getting older–aren’t we all? But it is by no means over the hill.
More significantly, its adherents are getting less colorful and
reflect instead the grayness of the corporate settings it is
conquering. Finally, MySQL is graying the distinctions that separated
it from Oracle and other heavy-duty database engines. MySQL, in short,
is becoming conventional.

The early achievements of this disruptive technology were to bring a
high-performance relational database down from the top shelf where
only those of means could afford it, and put it in the hands of
students, entrepreneurs working out of their homes, and modest web
site developers. This was a revolution dubbed situated software by Clay Shirky. Although MySQL was already being used by sites that
could afford more expensive databases (and the computer systems and
expert administrators who came in tow), these did not drive its
initial popularity.

Now MySQL AB has built a formidable marketing machine and carried
their product into the database mainstream, following a path similar
to Linux. Their trappings are starting to evince familiar themes. They
have salespeople in at least a dozen cities around North
America. Their new support and update mechanism, MySQL Network,
reminds me of a similarly named support system from Red Hat. MySQL’s
development of an online FAQ called a Knowledge Base, and the slogan
“MySQL Everywhere” plastered all around this conference, are
reminiscent of another large software vendor.

But MySQL AB has not forgotten the little guys who want a DBMS that
runs lean and fast, with near-zero administration. These users will
probably continue to be its largest base. Significantly, under the
conventional trappings I mentioned, I believe MySQL AB is still
structured in a fundamentally different way from a conventional
proprietary vendor, and is still behaving like a network of brilliant
independent software developers. They have always listened closely to
their users–you can see that at their conferences, where dozens of
developers turn up in distinctive shirts and attract flocks of
petitioners for new features–but they now are listening to paying
customers in the same intense, investigative manner.

For instance, I saw one of their leading engineers walk around an
evening reception recruiting representatives from international
customers to sit in on a session about internationalization, just so
he could hear their perspective on some problems he had been told
about by other customers.

It takes a certain financial and time commitment to attend a
conference, so for those who pony up the money to do so, the theme at
this one is “bigger and better.” Sessions on Java interfaces,
clustering, scaling, high availability, and replication decorate the
calendar for the next few days. One panel is even called “Challenges
in the Enterprise.”

And what are the newest features MySQL is pushing hardest? There are
no breakthroughs here (and I wouldn’t expect any, because relational
databases are a mature area in a research sense). The announcements
focus on things that competitors have had for years: stored
procedures, triggers, views. MySQL is not leading the conquest of new
territories. Rather, MySQL is catching up. That’s something they’re
proud of, and rightfully so.

I attended one session last night on a feature that will implement a
tiny snippet of the SQL standard, XPath support. In effect, MySQL,
which has always understood the SQL language, is learning a second
language–not a natural language (although MySQL offers increasing
support for character sets and other internationalization features)
but the complex world of XPath.

I find this feature an odd way to support XML. Most XML users carry
out XML/database interaction by using Java or some other programming
tool to break down the XML into constituent pieces of text and store
those pieces in a database structure that mirrors the XML. But SQL’s
XPath support buries the XML without alteration into a field in the
database.

The idea of XPath support in the database is that you start by storing
a string such as

<p>Why do <em>you</em> want to represent <em>structured text</em>?</p>

bodily in a text column. This text column can be any standard text
datatype in SQL (although MySQL will add a special XML type eventually,
to support validation and some optimizations).

In itself, this doesn’t help deal with XML. But MySQL will also
provide a couple functions such as ExtractValue and UpdateXML that
manipulate the XML with XPath queries. You could tell it to extract or
change, for example, the second <em> element in the string just
shown. Full text searches can reduce the time it takes to search large
collections of XML by two orders of magnitude, in comparison to
database queries without indexes.
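As a rough illustration of what "extract or change the second <em> element" means, here is a sketch in Python using the standard library's ElementTree module and its limited XPath support. This is not MySQL's implementation: the helper names `extract_text` and `replace_text` are my own, and they only mimic the general idea of the planned functions.

```python
import xml.etree.ElementTree as ET

# The same string stored bodily in the text column.
DOC = "<p>Why do <em>you</em> want to represent <em>structured text</em>?</p>"


def extract_text(xml_text, path):
    """Return the text of the first element matching path, or None."""
    root = ET.fromstring(xml_text)
    node = root.find(path)
    return None if node is None else node.text


def replace_text(xml_text, path, new_text):
    """Return the document with the matching element's text replaced."""
    root = ET.fromstring(xml_text)
    node = root.find(path)
    if node is not None:
        node.text = new_text
    return ET.tostring(root, encoding="unicode")


# "em[2]" selects the second <em> child of the root <p> element.
second_em = extract_text(DOC, "em[2]")
changed = replace_text(DOC, "em[2]", "XML")
```

Here `second_em` is the string "structured text", and `changed` is the original paragraph with that phrase swapped for "XML".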

The design of the XPath support is oddly disconnected from the
traditional structures of a relational database. As already shown, the
storage model jams all the XML into a single column, so that the XML
structure is handled independently from the schema of the table.
Furthermore, an XPath query that returns multiple strings from
different parts of the XML document concatenates them together,
space-separated, in a single row. I would have expected them to be
granted individual rows in the results.
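The difference between the two result shapes is easy to see in a few lines of Python. This is only a sketch with hypothetical helper names, using ElementTree instead of the database: one function joins every match into a single space-separated value, as described above, and the other returns one entry per match, which is what I would have expected.

```python
import xml.etree.ElementTree as ET

SAMPLE = "<p>Why do <em>you</em> want to represent <em>structured text</em>?</p>"


def matches_as_single_value(xml_text, path):
    """Join the text of every match into one space-separated string."""
    root = ET.fromstring(xml_text)
    return " ".join(node.text or "" for node in root.findall(path))


def matches_as_rows(xml_text, path):
    """Return one list entry per match, like separate result rows."""
    root = ET.fromstring(xml_text)
    return [node.text or "" for node in root.findall(path)]


single_value = matches_as_single_value(SAMPLE, "em")
rows = matches_as_rows(SAMPLE, "em")
```

Note that once the matches are joined, you can no longer tell where one ends and the next begins if a match itself contains spaces.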

There are many uses for XPath support in a database. One could extract
and display all the titles of different documents. One could run a
traditional SELECT to retrieve data from other columns or tables and
join it to XML content. One could find everything within
<price> tags and let the database perform some
calculations such as averaging. The more XML processing you can do in
the database, the less data has to be sent over the wire to the
client.
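The price example can be sketched on the client side to show what work the database would be taking over. The catalog document and the `average_price` helper below are made up for illustration; in the scenario described above, the equivalent calculation would run inside the database so that only the final number crosses the wire.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with two priced items.
CATALOG = (
    "<catalog>"
    "<item><name>widget</name><price>10.00</price></item>"
    "<item><name>gadget</name><price>20.50</price></item>"
    "</catalog>"
)


def average_price(xml_text):
    """Average the numeric content of every <price> element, or None if absent."""
    root = ET.fromstring(xml_text)
    prices = [float(node.text) for node in root.iter("price")]
    return sum(prices) / len(prices) if prices else None


avg = average_price(CATALOG)
```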

This new MySQL feature–not planned until 5.1 or even later–is
probably less useful with data-crunching XML (which has many small
pieces of text within multiple tags) than with documents, which are
flatter and have a high ratio of content to tagging. However, one
participant in last night’s BOF suggested the feature could be applied
to storing SOAP queries too.

MySQL’s turn to the mainstream is being reciprocated by its intended
audience. Attendance at yesterday’s tutorials was impressive; a couple
tutorials sold out, and the halls were filled with people at break
time. Today’s sessions and exhibitors will draw even more.

Ming Chow

I used to have two hard drives on my desktop: a 60 GB and an 80 GB. I used the 60 GB for Linux and the 80 GB for Windows. One problem I always had was utilizing all the disk space. It was much more than I needed! I manage all my digital photos and music on my Apple iBook, and I do not write, nor save, many documents to my hard drive.

Several months ago, I switched to Linux (Fedora), and sold my 80 GB hard drive for petty cash. Yes, I securely “shredded” the data on the hard drive before I mailed it out, thank you very much. Life is going well on the Linux end, but there are some things that I really miss on Windows, particularly a quick fix of playing NHL 2004 and Rollercoaster Tycoon 2 for leisure.

Fortunately, I kept a 10 GB hard drive from my old PC that I disposed of several years ago, as an emergency hard drive, just in case I ever needed it. I never used it for emergency purposes, and I considered using it as a doorstop on numerous occasions. Luckily, I didn’t do that, and I decided to do a fresh install of Windows XP Home Edition (with Service Pack 2) onto the drive. I also decided to install all the programs that I need in Windows onto the hard drive. To be “secure,” I decided that the system will not connect to the Internet.

Here is a list of all the important Windows applications I need:

  • Windows XP
  • WinXP Service Pack 2
  • DVD Software
  • Roxio
  • Ad-Aware
  • Adobe Acrobat Reader
  • Blender 3D
  • Cygwin
  • Eclipse
  • Firefox
  • GAIM
  • Java SDK
  • Microsoft Office XP
  • NHL 2004
  • Norton SystemWorks
  • nVidia GeForce2 Driver
  • RollerCoaster Tycoon 2
  • SSH Secure Shell (non-commercial)
  • Winzip
  • ZoneAlarm

It didn’t take too long to install everything listed above onto the 10 GB hard drive, considering most of the applications listed are free: I downloaded and burned the latest version of them onto a CD beforehand.

Finally, I cleared out my temporary files folder(s), ran the routine spyware and virus checks just to be safe (even though I am not connected to the Internet), and defragged my hard drive. All the applications were installed successfully onto the hard drive, and I had a good 4 GB left!

Now I can enjoy playing NHL 2004 and Rollercoaster Tycoon 2 again, with the peace of mind that I can fall back on Windows when I need to. What I learned: don’t waste computer products, such as an old hard drive; and you really don’t need a monster hard drive to do everything in the world that you want!

Matthew Langham

It is not without a warm fuzzy feeling of satisfaction that I see O’Reilly has now put up an official EuroOSCON 2005 page with a call for participation. EuroOSCON 2005 will be in Amsterdam from 17th-20th of October. I’ve been involved in this “project” for the last 3 years - ever since I wrote this post back in 2002. Plenty of emails, talks at EuroFoo last year and meeting up with Gina and Nat last week at OSBC have finally paid off. So excuse me while I dance a little jig around the table.

brian d foy

Related link: http://boss.streamos.com/download/interscope/nin/with_teeth/nin_garageband.sit

Nine Inch Nails’ Trent Reznor released “The Hand that Feeds” as a GarageBand song. The NIN website has a 70 MB Stuffit archive. He says in the README:

For quite some time I’ve been interested in the idea of allowing you the ability to tinker around with my tracks - to create remixes, experiment, embellish or destroy what’s there. I tried a few years ago to do this in shockwave with very limited results.

After spending some quality time sitting in hotel rooms on a press tour, it dawned on me that the technology now exists and is already in the hands of some of you. I got to work experimenting and came up with something I think you’ll enjoy.

The license says you can play with the files, but you can’t redistribute them or use them commercially without a different license. Still, it’s very cool.

Mac and Perl hacker Chris Nandor already has his own version as an MP3.

Schuyler Erle

Related link: http://cu.convio.net/site/PageServer?pagename=HUN_Community_networks

As you may know, the slumbering dragon known as the telecomms industry in the US has finally woken up to wireless community networking, and begun to lobby the various state governments to protect it, particularly against municipal plans for free wireless networking, such as those put forward in my old home town, Philadelphia. The Consumers Union has put up a website at HearUsNow.org to help people send a clear and direct message to the decision makers in state government that wireless networking offers the potential to link communities, encourage the spread of broadband, and perhaps even bridge the Digital Divide… Whether you live in Pennsylvania or not, now is the time to stand up for wireless community networking!

brian d foy

Related link: http://www.dailypennsylvanian.com/vnews/display.v/ART/425e17f82bf08

The Daily Pennsylvanian quotes TiVo CFO David Courtney saying:

“We haven’t committed to any plans [for integration] to [the Mac] because of the cost,”

along with


“unless we find a way to record it under the current platform, and I don’t think that will happen in the next few years.”

That might as well be never in internet time. It’s not the technology that’s the problem (or the cost). It’s the rights management that’s fouling the works.

I can download to my Mac shows recorded on my TiVo. I just can’t watch them. TiVo is effectively saying I’ll never be able to watch them on my Mac.

AddThis Social Bookmark Button

Last night during Veronica Mars (a show about a high school girl, the daughter of a private detective, who is investigating the disappearance of her mother and the murder of her best friend) I heard my first mention of Linux, and specifically Ubuntu, in a television show.

In this week’s episode the school has been receiving bomb threats. As Veronica is walking towards the school’s computer lab she hears two people arguing (I won’t get the dialog exactly right):

Male voice: If you haven’t even tried Ubuntu how can you say you don’t like it? It had the 2.6 kernel and Gnome 2 on the day Warty Warthog was released.

Female voice (who turns out to be Mac, the resident girl geek Veronica turns to for computer help): I’m happy with OS X. It’s got (something about all the awkgrep that I need) and I don’t have to worry about font de-uglification.

Male: You don’t have to do font de-uglification anymore, and it’s free! You’re living in the dark ages!

Mac: I don’t care, I know what I like and I like what I know.

The male Ubuntu geek turned out to be the person making the bomb threats, but the threats weren’t real. He was trying to frame a bully who had beaten him throughout Junior High. Kinda interesting considering Ubuntu means “humanity to others.” Mac, of course, means “Macintosh”. Windows XP means, “I’m not a cool enough OS to get mentioned or used by any of the hip characters on the show.”

brian d foy

Related link: http://www.oreillynet.com/pub/wlg/6837

Last week I bought the lowest-level, cheapest Dell desktop computer I could get so I could test some Perl software and some web pages on Windows.

It got here on Tuesday: at the earliest time in the estimated delivery and quicker than UPS could update its tracking. Very nice. However, I didn’t get the flat-panel monitor I thought I was getting: they sent a 17-inch CRT. I looked back at the order and read the fine print. Yep, they snuck the CRT in there under four in-house ads saying I could get a free flat-panel monitor (* if I bought a different computer). Oh well, I only paid $450 and the monitor was just gravy. I already have a monitor for it.

Physical set up is easy, and this computer looks really ugly. There’s a plastic facade on the front of the tower that adds two inches, and the monitor and mouse both have cables snaking to the back of the tower. I think I’ll have to get one of those wireless mice (or maybe I’ll see about adding a bluetooth keyboard and mouse).

The software setup was no problem, and I already had ActivePerl and cygwin (along with a few other things) burned to a CD. I didn’t want to hook this thing up to the network just yet. I ended up wasting an hour playing Minesweeper. I could play that game all night.

Then I installed Firefox, but not just any version: the Irish version. What the heck, I’m already on Windows, and as the Outward Bound folks say, “Get into what you can’t get out of”. I’m setting myself up for pain, so let’s do it in Gaelic.

That was yesterday. Today, I figured I’ll try to connect it to the network. What a pain! I connected to my network through my Airport Express. I ran the Network Setup wizard, restarted, and badda-bing badda-boom, it picked up its DHCP address and router. The only problem was the crappy ISP nameservers which were down. No problem, I figured, I’ll just manually add a couple of nameservers. No I won’t, I guess. I couldn’t find any place in Network Setup to change it. I checked the Microsoft Help thingy, and it returned a document from the MSN Knowledge Base telling me which Registry Key to edit with regedit. Ugh.

I tweaked my Airport Base Station to give out new addresses for the nameserver, restarted that, and restarted Windows (which I needed to do anyway so some other software could do its magic). Now the Windows box is on the network. Well, it’s almost on the network. I’m pretty sure I followed the different instructions to create a network share, but no dice. I’ll have to think about that later. Maybe I should just figure out how to set up an sshd instead.

Oh well, things aren’t so bad. By next week I should be getting some work out of this machine.

Mark Finnern

Related link: http://www.futuresalon.org/2005/04/fab_friday_.html

You may have missed him at Etech and now his book is out: Neil Gershenfeld, head of MIT’s Center for Bits and Atoms, will present FAB: The Coming Revolution on Your Desktop–From Personal Computers to Personal Fabrication at the Future Salon this Friday at SAP in Palo Alto.

Personal Fabrication and Neil’s book make me excited about the future like nothing else lately. They change your mindset: the limit is not what I can buy but what I can imagine, so let’s get to it and build it. It’s a renaissance of tinkering: dust off your old workbench and Make.

More later, I just wanted to get this out now.

Here are the details: Tomorrow, Tax Day, Friday the 15th of April: 6-7pm networking with light refreshments proudly sponsored by SAP. From 7-9+pm, presentation and discussion. SAP Labs North America, Building D, Room Southern Cross, 3410 Hillview Avenue, Palo Alto, CA 94304. As always, free and open to the public.

Update: We have a new RSVP Service. Please use it so that we can calculate the food and whether to use the larger room. Improve your commute by sharing it with a fellow Futurist. Check the Ride Board for opportunities.

If you are not in the Bay Area, we will Webcast the talk and discussion: 7pm to 9pm

Point your Windows Media Player: http://mfile.akamai.com/14947/live/reflector:39875.asx?prop=n
(This one is different than last time. We are streaming it via third party to make sure as many as possible can take part in the event. Thanks to SAP for sponsoring this extended service.)

We will also have an IRC chat session running for questions to Neil:
Server: irc.freenode.net
Channel: #futuresalon

brian d foy

Marketplace reports that Intel doesn’t have a copy of the April 19, 1965 issue of Electronics where Gordon Moore first published Moore’s Law.

They further report that if you have one, Intel wants to give you $10,000 for it. I couldn’t find details on the Intel site, so if you’re the packrat who has the issue, you’ll have to do the leg work yourself. :)

Uche Ogbuji

Related link: http://c2.com/cgi/wiki?WhatsWrongWithEjb

In a recent client project we’ve had to do an assessment of EJB versus other technologies for a specific application. Not much new there, except that this time the primary alternative was RDF/OWL. Most of the resulting discussion was based on our own specific experience with EJB and Semantic tech, but I did stumble across this interesting Wiki. All over the place, as well-trod Wiki pages tend to be, but a thought-provoking read even so.

brian d foy

I can use iChat to send SMS messages to my phone, and my phone can send an SMS message back to my iChat.

First, under File>New Chat With Person, I added my phone number with a leading +, 1, the area code, and the seven digit number (so, it looks like +1xxxxxxxxxx). I can’t immediately chat with my phone because of my privacy settings, but I can add my phone as a buddy. Once I do that, I type in a message and it shows up almost immediately as a text message on my phone.

If I reply to the message from my phone, the message shows up in the iChat window just like any other chat session.

This is more than just the standard geekery though: this means I can easily send a text message to my wife without dealing with my phone. There were probably other ways to do that easily (such as T-Mobile’s web interface and also email), but I like how iChat adds my buddy name in front of the message rather than some odd gateway name, and that I get the reply back in iChat. Very handy.

brian d foy

Related link: http://www.theperlreview.com/Interviews/mjd-hop-20050407.html?orm

Read my interview with Mark Jason Dominus about his much anticipated new book, Higher Order Perl.

This is the first in a series of interviews that I have lined up. Between issues of The Perl Review, I want to add more free content like this. Enjoy!

brian d foy

Related link: http://www.hitachigst.com/hdd/research/recording_head/pr/PerpendicularAnimation.…

Hitachi presents its perpendicular storage strategy in the style of Schoolhouse Rock.

They said last month:

In March 2005, Hitachi Global Storage Technologies demonstrated an areal density of 230 gigabits per square inch (Gb/in²) on perpendicular recording technology, the highest areal density achieved to date based on vertical recording. This accomplishment represents a doubling of today’s highest data densities on longitudinal recording technology.

Uche Ogbuji

Related link: http://pyblosxom.sourceforge.net/

My brother Chimezie and I recently launched a blog (“Copia”). We’re both blogging newbies, besides my using these excellent facilities on O’Reilly Network. I had a lot to learn, but settling on a simple, small, and very hackable blog engine in PyBlosxom 1.2 made it a pleasurable experience. I have several notes on my experience setting up PyBlosxom. Maybe these will help others get their feet wet.

What blogging tools have you found to be most hackable?

brian d foy

I’m a Mac user, and I just bought a Dell. I’m not going to run Linux on it. I want it to run Windows. It’s a little test machine. I must feel some latent guilt about this, because I feel this need to come clean. The most I have to do with Windows is read Preston Gralla’s weblog.

I’ve done this with Virtual PC before, but I never really liked having an emulated Windows: it’s as slow and painful as a spyware infected PC. Virtual PC costs a bit over $200, although I didn’t shop around. I bought a low end Dell Dimension 2400 with a Windows XP and 512 MB RAM upgrade for $450. The real thing comes with a 17-inch flat panel monitor for “free”. I’ve wanted another flat-panel for a while, so the Dell is already as expensive as Virtual PC and the monitor together.

I had an amusing time buying the thing. What’s with all the choices? Steve Jobs just says “You pick one of these and we send it to you”. I can get any color I like as long as it’s black, er, aluminum colored. Dell lets me pick a CPU too. You mean there is more than one? Holy cannoli. I don’t even think I know what I have in my, um, G4 except that it’s a G4 something-or-other. After that I have to pick a processor speed? Again, I don’t even know what I have now. I could choose from 5 or so “productivity” suites. It was interesting seeing all the different choices, even if I really just wanted to click a “I’m a cheapskate” button and skip all that stuff. I can definitely see how people might get confused about buying a new computer.

I think it’s cool that I can do all that, and I remember liking Michael Dell’s book about his just-in-time supply chain. It’s all cool stuff. I’ve been indoctrinated otherwise though. Choice just isn’t in my Mac vocabulary. The experience wasn’t as difficult as buying a train ticket on Amtrak’s website (I wonder if they are actually trying to get de-funded), but I did have to go through a couple of steps of “Are you sure you don’t also want to buy a …”.

So, somewhere, a Brown truck is pulling up to a Dell location and someone is loading my computer onto it. Some third shift worker is picking up my box and setting it in the truck. This isn’t Gateway, so I’m not imagining some beefy guy carrying my PC while he runs to my house. Somewhere near that truck is a bigger truck going to Chicago. It’s almost romantic. Dude, I’m experiencing the Dell supply chain.

I should have it on Monday (I skimped on the shipping, like everything else). I don’t know if I actually want it right away, because I’m not looking forward to setting it up on my home network. I don’t want it ever to see the internet, and I don’t want the internet to see it. I hope I don’t have to connect to the net to configure it. That would certainly suck.

I’ll see what happens.

Kevin Shockey

If day one of the Open Source Business Conference was summarized by great optimism, then, at least for me, day two was punctuated by reason for concern. My concern stems from our willingness to believe that open source cannot be stopped. That may be true, but there are certainly some circumstances that might significantly slow the momentum we now see.

To start day two, both IBM and Oracle were scheduled to kick things off. Although Matt Asay started out the conference on Tuesday claiming that none of the keynote speeches had been “bought”, it is difficult to understand the why and who of some of the keynotes. Wednesday morning started with several speeches which I felt did not add a great deal to the conversation occurring at the conference. I do have to admit that I skipped the Oracle keynote, so maybe I missed something. However, a friend of mine sat through the speech and told me that it was definitely the worst of the conference. I can believe it. I did attend Dr. Irving Wladawsky-Berger’s presentation on “Innovation in an On Demand World: The Future of IT.” Dr. Wladawsky-Berger is Vice President of Strategy and Innovation for IBM. For me this presentation did nothing more than repeat what we all already live and breathe on a daily basis. It was, however, interesting to see Dr. Wladawsky-Berger contrast Kim Polese’s use of the builder metaphor with an engineering metaphor for the future of innovation. There were two things to consider from Dr. Wladawsky-Berger’s presentation. The first was his emphasis on innovation in the business processes (think management as well) of our enterprises. The second was his mention that IBM is doing research on self-managing systems. I found this an extremely interesting concept. With all we hear of the enterprise challenge of managing the installed infrastructure, such systems could be very helpful in the future of the enterprise.

Even though enterprise strategy was only one track, it seemed to be consistently present throughout the conference. Most of the keynotes were focused on the impact of open source on the enterprise, as were most of the companies in the emerging technology and start-up showcase. Some say this focus stems from the dissatisfaction most CIOs have with the lack of return on their enterprise level investment in information technology. At the SpikeSource Town Hall meeting, I heard first hand how upset they are. Therefore, many people see open source as the solution to this concern. We must remember one fact, however. Open source technology will not replace enterprise legacy investments. Very few organizations are going to rip out existing solutions to replace them with open source equivalents. I see a bigger need in applying open source technology to improve the efficiency, improve the operational stability, and reduce the cost of overall maintenance.

As I mentioned above, the second day balanced the first day’s optimism with a dose of concern. The source of my concern came from two subsequent sessions. First I participated in Matt Thompson’s session on “The Role of Open Source and Community Development in Emerging IT Markets”. Matt is Director of Technology Outreach for Sun Microsystems, Inc. I went to this session to learn more about the dynamics of building communities, but came away with something completely different. One of Matt’s main premises was that “Many 2nd and 3rd world countries are leveraging Open Source for accelerating the sophistication of their IT industries.” Matt proceeded to present his evaluations of China, Malaysia, and the Philippines and how they view open source and open source licenses. First, he said, they recognize the opportunity within open source to complete their infrastructure without paying for the development of the necessary intellectual property (IP). Open source grants them the IP and they are freely adopting anything they can to accelerate that process. Second, they are not too concerned about the open source definition and the legal implications it carries. These countries do not feel any obligation to donate any code changes back to the community. Many in the audience were both offended and surprised at this statement. Later in the day, the risk of this situation would come into clearer focus.

To essentially finish the conference, Professor Lawrence Lessig presented a call to arms in the battle for the right to innovate. In the final keynote speech, Professor Lessig discussed the battle currently under way by reviewing the cases for Betamax, Eldred, Napster, and INDUCE. With these cases as a precedent, he theorized that Microsoft could assume an offensive strategy. This strategy would include collecting as many software patents as possible, hiring as many patent attorneys as possible, and then using them to protect its monopoly. Even though this is only a theory, most of us are already aware that the first two premises are true. In the end, Professor Lessig admonished the audience to consider their actions more carefully and answer the following question: “What happens if we lose this battle?” If you are uncomfortable with the answer, then ask yourself: “What are you doing to support companies that are actively fighting this battle?”

To sum up the conference, I’ll borrow one of the key tags from a comic book and movie we are all familiar with. Open Source represents great opportunity, but with that great opportunity comes great responsibility. If we want to preserve this great opportunity, then we must actively defend our freedom to pursue that opportunity. We can do this by ensuring that we comply with the implicit obligations of the software licenses we accept and support those organizations that protect our freedom to choose that software in the first place.

Do you believe that Microsoft can stop open source?

A friend of mine is a metal fabricator and musician. He’s recently taken up lock picking, or ‘Steel Bolt hacking’ in the hip lexicon.

He had a contractor working on his house. They got along well, and the contractor was sympathetic to the financial difficulties of owning a house. One day he brought over a box of locks. He said that locks were expensive, but if you learned to pick locks you could go to the construction surplus supply yard and get high quality locks for cheap…just a couple of bucks, but without keys.

If you can pick the lock, you can open it up and rekey it.

The fellow also loaned Tom some tools. Tom took to it and promptly fabricated new picks and tension bars (torque bars?). And tonight after we’d chatted for a bit about nice things he sprung his surprise on me, with an overly casual ‘oh, by the way, I’ve been messing around with lock picking.’

How F….g Cool! My jaw dropped. Tom is emphatically not a computer person, but considering the extremely diverse Call for Papers for the upcoming What the Hack conference in the Netherlands I promptly invited him.

Plus I watched carefully. I learned how to rake a lock, and was able to open a Yale padlock and a Schlage Deadbolt.

It was great!

Tom still had the original borrowed pick manufactured by a ‘professional.’ I tried it out, and realized that it was fundamentally inferior to the pick that Tom had fabricated, primarily because of the handle. The ‘store bought’ pick had a thin handle that wanted to flip in my fingers. Tom’s pick has a nice solid handle that offers better control.

I have some notes on how he fabricated his picks, and perhaps I will be able to replicate his design.

So Tom got a call from a musician who was a hero to us when we were in High School. He played me the message. He wanted Tom to play a studio session, which Tom was excited to do. He took his practice locks and picks in order to relax between songs. The musician commented that “Lock picking was the new knitting.”

Speak and the world wanders by, remain silent and, uh, well the metaphor broke down, sorry.

Kevin Shockey

I would have to say that so far Kim Polese and her team have made the biggest impression on me and perhaps on the conference attendees as well. Yesterday they had a large team in place, and even larger if you consider that most of their Advisory Board was in attendance. With that large of a team, I’m sure there was a lot of buzz surrounding their announcements.

Most significant of those announcements was the announcement of the general availability of their SpikeSource Core Stack. In addition they announced a broad range of enterprise support and related services to enable enterprise IT departments to build and deploy open source solutions with greater confidence. It is in this area that I find something truly innovative and compelling. Their announcement of testing as a service is something that could significantly impact the way software is built and validated.

One market they may want to include within their focus is the “medium” size business with an internal software development capability. The size of the development shop I’m thinking of is between 10 and 20 developers, coders, analysts, and testers. As a previous director of software development management, I can appreciate the value in having a testing service available. I can appreciate the ability of a development shop to validate an internally built solution and a service to inform me of the fixes most critical to the operational performance of that system.

I want to thank Murugan Pal and Glen Martin for taking the time to answer all of my questions and for sharing their passion for the problem space and how it impacts thousands of companies. Murugan has also expressed interest in making one of my project’s original dreams come true.

From the beginning, the other founders of the SNAP Development Center and I hoped that other academic institutions would find value in the model we created. By placing an open source project within a university, we bring a whole new dynamic to the education of computer science and engineering students. The opportunity to work on real software, deal with real business issues, and confront real customers is an experience that is immeasurable. Murugan immediately saw the value of this model and suggested we work together to port the approach to India. This is an opportunity that is far beyond anything else we could ever hope to accomplish in the SNAP Development Center. I look forward to working with Murugan, and maybe even a trip to India. According to Ray Lane it is a trip that every American should take. When do I leave?

If you are a SMB development shop, do you think SpikeSource could help improve the quality of your apps?

Kevin Shockey

For many in attendance at the conference, the themes discussed and the excitement of the event are nothing new. However, for me, amidst a whirlwind afternoon, it is sometimes difficult to pinpoint the relationships that tie everything together, especially so soon after experiencing them.

Certainly the low-hanging fruit of my experience is the influence of open source software in commoditizing the software industry. This started with Kim Polese’s keynote speech, “Coping with commodities in the new IT Marketplace”, or as she summarized in her conclusion: “Coping with great opportunity.” Kim offered up an interesting comparison, using the construction industry as an analogy for what she sees happening in IT. As Robert Lefkowitz would later comment in his presentation, she completed her obligation by crediting Doc Searls for drawing her attention to the use of construction-related titles in software, like builder, developer, and architect.

She follows the analogy through to illustrate the commoditization of building materials, and how ultimately this enabled the creation of the largest industry in the world. So the inference is that this should happen as well with software.

If nothing else, listening to the speakers today leaves one with a great sense of optimism for open source and the software industry. Something Kim offered early in her speech was the prediction that there would be more money made because of open source than from it. This is certainly true for Google and similar web sites running a mostly open source stack of software.

I truly enjoyed Kim’s choice of using the work of Hugh MacLeod from the Gaping Void to illustrate her slide deck. I really enjoyed the graphics, and I think it complemented her optimistic message.

After this session I caught Robert Lefkowitz and his discussion of “The paradox of choice.” Robert is an extremely polished speaker with an even more polished set of ideas, relationships, and conclusions. It goes without saying that his presentation was thought provoking, if a little confusing. I’m sure he knew that some in the audience would get lost in the twisting loops of his thought process and, in typical style, used that to prove his point.

I’m thankful to Robert for taking questions. He answered one that has been bugging me for a while. I wanted to know how software could become a commodity in the same way as other commodities like orange juice. He offered that it was not the actual software that was becoming interchangeable but the providers of the software. This clears up what I believe most people are referring to when they discuss software commoditization; however, it contradicts Kim’s construction industry analogy. I think that there is a little of both going on, and maybe SpikeSource will help on the software side. Then again, maybe Tim O’Reilly’s vision of web services holds the ultimate view of what software commoditization means. When I no longer care how functionality is provided to me, then I’ll accept that software is interchangeable.

Finishing the day was the long-anticipated presentation by Geoffrey Moore about why he believes open source has crossed the technology chasm. I’ll have to get some sleep before I think I can give a good review of his speech. He offered some great perspectives that I hope to share tomorrow.

Finally I attended the SpikeSource Town Hall Meeting. My many thanks to Robyn Forman for the invitation. This was another very deep discussion which needs special coverage. The meeting was aimed at the many CIOs in attendance and included a lively exchange of experiences and straight-from-the-hip comments. I believe that most in attendance came away with a sense of some of the issues facing CIOs in the enterprise market.

Do you share the sense of optimism?

brian d foy

Related link: http://www.idlewords.com/2005/04/dabblers_and_blowhards.htm

Maciej Ceglowski, a Perl programmer and painter, gives a counterpoint to Paul Graham’s Hackers & Painters.

In short, he thinks the metaphor went too far and isolates a connection that isn’t all that special. It could have been “Hackers & Pastry Chefs”, for instance.

I like both essays, and I’m not picking sides.

brian d foy

Related link: http://alpha-geek.com/2005/04/05/uninteresting_programming

One of Jeremy Smith’s friends says programming isn’t interesting, and it’s not where the “buzz” is. I say that programming is not meant to be interesting: it’s what you do with it that counts.

Someday I’ll sit down and put down all of my thoughts on how people choose their first programming language. I still think it really has nothing to do with how good that language is. It seems to me, in my very unscientific and unmethodical observations, that people either reach for the closest language (i.e. the one with installed tools or the one their buddies use) or the one that matches the lifestyle they want (i.e. “I want to be a hip web programmer with yellow tinted shades”).

Part of the second group is the geeks on the bleeding edge who can’t commit to one thing (or even a handful of things). They are that fifth standard deviation (the one on the high side!). They are the people who have already abandoned the technologies that most people are just starting to use. I tend to think they use new things just because they like using new things. “Do something useful? Pshaw!”

The danger comes when the middle part of the curve chases that lead group, or when that lead group thinks the middle group should follow them. It’s okay that the alpha geeks are on the frontiers exploring new things and creating new technology: it’s a necessary part of the ecosystem. A lot of the time, however, I think their choices are motivated more by the desire to learn and be different than by the desire to create something, which is the reverse of the other group, who just try to make it through the day without breaking anything.

So, let’s make it personal. It’s not the passive “programming is boring”, it’s “I’m bored with programming”. For a lot of people, programming is not the point, and it doesn’t have to be the source of passion.

Kevin Shockey

With keynote speeches by Jonathan Schwartz and Larry Augustin this morning, the Open Source Business Conference opened with a bang. Each in his own way offered predictions for the future of the software industry. Jonathan offered this prediction: “Every industry that touches the Internet will eventually see their products trending to free.” He used free cell phones as the example and, as the promise, an analogy of GM theorizing about how to offer free cars.

Larry offered up his prediction for the next wave of open source applications. He offered six trends that he observed from SugarCRM, Compiere, Asterisk, and Vista. To predict where the next explosion of open source applications will occur, he offered the following patterns to look for:

  1. Traditionally a big expensive heavy application, long sales cycles, hard to implement and install
  2. Competitors are big, traditional, proprietary competition and use traditional enterprise sales cycle
  3. The open source project has a large, enthusiastic free user base
  4. The project has an enthusiastic developer ecosystem
  5. The vertical market represents a big enterprise market opportunity
  6. That market is currently under penetrated in the SMB sector

Although Jonathan’s prediction is a little hard to comprehend at this point, I’m positive that Larry’s is 100% right on the money.

The afternoon and evening are looking very SpikeSourcish, with Kim Polese’s keynote address and the SpikeSource Town Hall meeting with Ray Lane. In addition, there is the much awaited keynote address from Geoffrey Moore. So what are you doing to celebrate, now that open source has crossed the chasm?

Any surprises other than SpikeSource?

Kevin Shockey

Related link: http://www.spikesource.com

Last night SpikeSource unveiled their first surprise for today. During the middle of the night they launched a completely redesigned web site. Finally, the picture starts to come into focus. From my initial analysis they seem to want to, at least at this point, become a development shop’s best friend. This could become a sly move. As I have always said: “he who controls the developers, controls the market.” Clever indeed.

For the curious they have announced some pricing information which finally shows some information about their direction and plans. I wonder how much a gold or silver support plan might cost?

For developers they have launched:

  • MySAM
    … which helps developers keep track of the open source components in your company. An analysis tool that helps you understand the risks presented by security vulnerabilities, bugs, and updates for these components.
  • Spike PHPCoverage
    … provides the first and only tool available for measuring and reporting code coverage provided by the test suite of a PHP application. This tool records the line coverage information for any PHP script at runtime. Spike PHPCoverage is available on Sourceforge.net as an active open-source project.
  • Test Upload Service
    … is a free automated testing service for applications built on the SpikeSource stack. Tests include unit testing, functional testing, code coverage, and test execution and reporting across multiple platforms. Completion and failure reports are available by email and as a hosted service provided by SpikeSource.
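For readers unfamiliar with line coverage, here is a minimal sketch of the general idea of recording executed lines at runtime. It is written in Python rather than PHP, it is not how Spike PHPCoverage itself works, and the names `record_line_coverage` and `classify` are made up for illustration.

```python
import sys


def record_line_coverage(func, *args, **kwargs):
    """Run func and return (result, executed line offsets within func).

    Offsets are relative to the line holding the function's `def`.
    """
    executed = set()
    code = func.__code__

    def tracer(frame, event, arg):
        # Ignore every frame except the function under test.
        if frame.f_code is not code:
            return None
        if event == "line":
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        # Always restore whatever tracer was active before.
        sys.settrace(previous)
    return result, executed


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


if __name__ == "__main__":
    for value in (5, -1):
        result, lines = record_line_coverage(classify, value)
        print(value, result, sorted(lines))
```

A single test input exercises only one of the two `return` lines in `classify`. Turning that into a coverage percentage would also require counting the executable lines, which this sketch leaves out.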

And finally, for independent software vendors, SpikeSource helps ISVs embrace open source with their ISV Certification service.

ISV Certification offerings include:

  • Stack Validation
    … provides testing and validation of open source components and revisions. Stack Validation is priced per configuration, and includes periodic revalidation.
  • Secure Computing Alerts and Fixes
    … provides update and configuration tooling, plus access to an ongoing stream of component updates. It also provides continuous alerts on defects and corrections.
  • Application Validation
    … provides ongoing testing of your application together with our recommended stacks. It does, however, require the Spikesource Secure Computing service.
  • License and Provenance Management
    … streamlines your developers’ download process and improves your visibility into license requirements. SpikeSource can provide single-source open source technology downloads, filtered to your approved licenses or technologies, and help manage approvals and recording.
  • Component Evaluations
    …provides analysis of the impact of adding a new open source technology to a current or future product.
  • SpikeInfo
    …provides access to the results of the SpikeSource testing results database. These results are integrated into other Spikesource products and features.
  • SpikeSearch
    …provides access to an extensive collection of articles from in-house experts, news group postings, mailing lists, and RSS feeds.

Finally, the missing blogs are now available. The first, SpikeSource blog, will provide thoughts and comments about the open source community, enterprise IT, and the evolution of software assembly and testing. The second, Compiled By, will summarize the day’s events in tongue-in-cheek style. As they claim: “Think of it as ‘Talk Soup’ for the OSS world.”

Kevin Shockey

Related link: http://www.spikesource.com/news/march31.html

“SpikeSource believes that there is a renaissance under way for software where corporations are not beholden to specific vendor solutions and where best-of-breed assembly helps companies run their businesses better,” said Kim Polese, SpikeSource CEO.

With their latest announcement, SpikeSource announces their alliance with the Open Source Development Laboratory. This move follows their recent announcement of the dream team advisory board. Reading like a who’s who of the open source industry, these announcements build great momentum as they head into the Open Source Business Conference.

Tomorrow I expect SpikeSource to announce the general availability of their SpikeSource Core stack, i.e. a LAMJ stack. After approximately four months as a beta offer, the general availability of their product will mark the formal arrival of SpikeSource. I, for one, am anxiously awaiting more details about their complete product and service offers. To date, the amount of information available has been sparse and they have remained somewhat aloof. In general, I believe that they have much more going on than is apparent.

As for a rebirth of software for corporations. Well…. I believe that corporations will almost always be beholden to specific vendor solutions. As time goes by I become more sure that we will have a completely heterogeneous information technology future. In every sense of this phrase. Open source software will not eliminate all other forms of software. It goes against everything that 40 years of history has taught us. In addition, as long as there are niche solutions, corporations will accept whatever technology those solutions bring. I remember receiving a wide spectrum of technology whenever we purchased an “integrated” solution. Whatever technology that solution required, we received. Whether this will change as software continues its march towards commoditization is yet to be seen.

Do you think that SpikeSource will become the new Dell?

Kevin Shockey

Related link: http://www.itgroundwork.com/news/pr_033005_GW85.html

As I head to the Open Source Business Conference, GroundWork Open Source Solutions, Inc., a leader in open source-based IT management solutions, announced it has secured $8.5 million in its second round of venture capital financing. The Series B financing, which was led by Mayfield, includes Series A lead investor Canaan Partners.

Once again we find yet another open source company with venture financing. According to a quick summary, Matt Asay suggests over $150 million has been invested in open source related companies since March 2004. I’ll agree with him, that is some serious coin. I don’t have the details, but that number seems plausible, if maybe a little high. In the end, the take away is open source is very important and is changing the information technology industry.

As the newest member of this prestigious group, GroundWork is looking to use the investment to expand product and service offerings and build out its marketing and sales programs. Currently GroundWork offers an open source IT infrastructure monitoring solution that delivers enterprise-class availability and performance. GroundWork Monitor is their monitoring solution, and it is based on powerful open source software such as:

    Nagios (monitoring)
    Jetspeed (Web portal framework)
    Ntop (network traffic analysis)
    Syslog NG (log file analysis, consolidation and filtering)
    RRDtool (analytical graphing)
    Cacti (network and systems performance graphing)
    MySQL (relational database)
    Nmap (network discovery)
    (Red Hat) Linux (operating system)

With what seems like a solid product and a nice base of 40 customers, I imagine that getting the word out will be task numero uno for GroundWork. “GroundWork has already demonstrated strong customer traction in the market, offering a very viable and compelling open source alternative to commercial IT management solutions,” said Robin Vasan, managing director of Mayfield. “As enterprises continue to embrace open source for more mission-critical applications, we believe GroundWork has the potential to capture a significant portion of the $7 billion worldwide IT management market,” he added.

Who will be next to get some help?

Uche Ogbuji

Related link: http://www.jensofsweden.com/

My wife has a 4GB iPod I bought her two years ago for her birthday. It’s slick, and she loves it. I’ve always wanted to wait for the right music player before taking the plunge myself, but no model of iPod has ever really tempted me, least of all the iPod shuffle. Looking elsewhere, I think I found my dream player.

A recent comment in the O’Reilly Weblogs directed me to Jens of Sweden. All I could say is “wow”. The Jens players are svelte, gorgeous, reasonably priced, and very functional. The one that caught my eye was the MP-120. If I’m doing my currency conversion rightly, it’s a great deal at just under $200 MSRP for 1GB of songs, as well as all its other great features.

The MP-120 is very compact (certainly on the iPod shuffle scale) and yet seems to have a nice display. Sorry, but there’s no way in hell I’m buying a music player without a display. Even more importantly, it plays OGG Vorbis, hands down my music format of choice, which means I won’t have to go to additional lengths to encode my songs for it (the suggestions for playing OGG Vorbis on iPod read like encephalectomy instructions). It’s designed to work well in Linux as well as Windows (not that I care) and OS X (I would like to grab some songs from my wife’s collection on Mac). It uses USB 2.0 (high speed) for data transfer. All perfect for me. I checked around the Net for user reviews, and they seem to be very positive.

But there’s a catch. I can’t seem to find the bloody things anywhere in the US. I think I found some places that stock the older MP-110, but not my particular dream kit. Preliminary search hasn’t revealed any vendor that clearly would sell/ship to me in the U.S. I guess worst case I can keep my eyes peeled while in Amsterdam for XTech 2005. But there’s very little doubt about it, from the specs and reviews I see. I’ll be having one of those sweet machines this year. I guess side by side with Lori’s iPod, I’ll at least be able to make a more informed comparison.

Do you know how to get an MP-120 in the U.S.?
