Microsoft has spent a lot of time and effort trying to convince us to place Office documents on our Web sites -- witness the Microsoft Office 2000 Web Components, which are ActiveX controls designed to bring spreadsheet (Excel) and database (Access) functionality to a Web site. Whole books have been written about these Web components.
But Microsoft has spent much less effort pushing the tools they have created for going in the other direction -- placing the Web into our Office documents. This article shows one application for doing just that: importing stock share prices and indexes into an Excel spreadsheet. This is something I decided to do recently to keep track of my meager financial assets -- a few stocks, some bonds, and, of course, the usual savings/checking accounts. I have used MyYahoo to keep track of stocks, but it does not provide the range of formatting possibilities that Excel does, and I cannot do such things as compute my net worth (and I use the term a bit laughingly).
The problem, of course, is how to get updated stock data into an Excel worksheet automatically, either at regular intervals or with the click of a button. Not only do I want the current (read: delayed) price of my stocks (as well as other data such as the current change in price), but I also want historical stock prices so I can chart week-long, month-long or year-long performance for the stocks.
I devised a two-fold solution that uses Web queries and the WebBrowser control. It is a simple matter to set up a Web query that will retrieve stock information from Microsoft's MSN Web site. With a little more effort, we can use the WebBrowser control to get historical stock prices from, say, Yahoo. Of course, Excel excels at creating the desired performance charts.
Figure 1 shows the main part of my Excel application, simplified for this discussion (the stock list is fabricated and I actually have three charts -- one for 7-day performance, one for 30-day performance, and one for 180-day performance.
|Figure 1. As the active cell moves from row to row, the chart on the right changes to reflect the stock associated with that row.|
If you are interested in creating a similar application, read on.
The application consists of four worksheets.
|CSCO||Cisco Systems, Inc.|
|DELL||DELL Computer Corporation|
|IBM||International Business Machines Corporation|
|DOW||Dow Jones Industrial Average Index||$INDU|
|NASDAQ||Nasdaq Combined Composite Index||$COMPX|
|RUS2000||Russell 2000 Stock Index||$IUX|
We need this sheet for two reasons. First, we need to associate each stock symbol with the company name exactly as MSN knows it -- for example, "Cisco Systems, Inc." rather than just Cisco Systems. Although the Web query accepts stock symbols in its URL, it does not return those symbols. It returns only the full company name (along with the stock data, of course). So we need to have the exact name in order to find the information corresponding to a given stock symbol. (These names, by the way, can be found by just looking at the returned data in the Web query.)
In addition, MSN uses rather unintuitive symbols for stock indices, such as $IUX for the Russell 2000 index, whereas I would prefer something like RUS2000 in my main spreadsheet.
The general plan for the application is simple. Whenever the user clicks on a button to get stock info, we create a new Web query file and execute it, using the
Add method of the
QueryTables property of the WebQuery worksheet:
With Worksheets(conWebQuerySheet).QueryTables.Add(Connection:= _ "FINDER;" & sWebQueryFQFile, _ Destination:=Worksheets(conWebQuerySheet).Range("A1")) .Name = "Microsoft Investor Stock Quotes" End With
(There are some additional properties that you will find in the actual code.)
Once the Web query is complete, we can glean the required information from the WebQuery worksheet and fill a global array of type
Private Type utStocks Symbol As String MSNSymbol As String ' For indices, this differs from common symbol ' Data from table Name As String Price As String Change As String Volume As String High As String Low As String PrevClose As String YearHigh As String YearLow As String PERatio As String MarketCap As String EarningsPerShare As String SharesOut As String ' Computed ChangeSingle As Single PercentChanged As String ChangeIsNumeric As Boolean End Type
Once we have filled the array, we can do whatever we like with the data, including placing it on the Activity worksheet. You might also want to create a ticker-like display, as shown below, or do other things with the data.
As for historical data, this requires a bit more work. I am not aware of any Web queries that will retrieve such data, so we need to use a more brute-force approach. In particular, we use a WebBrowser control to browse to the appropriate URL at Yahoo, and then use a portion of the DHTML object model to pick apart the returned document and extract the historical prices.
Let's look at the details.
A Web query retrieves text or data from tables or other formatted areas on Web pages. Finding decent help information about Web queries from the Excel 2000 help files is as frustrating as you might expect, but fortunately we don't need any information; we can just mimic a Web query that comes with Excel.
In fact, a handful of Web queries are installed along with Excel. For Excel 2000 (and Excel XP), one of these Web queries retrieves stock information from the MSN Web site. The actual text file is named
Microsoft Investor Stock Quotes.iqy and is located in the Queries folder under the main Office folder. Its contents are:
WEB 1 http://investor.msn.com/external/excel/
fund or other MS Investor symbols separated by commas."] Selection=EntirePage Formatting=All PreFormattedTextToColumns=True ConsecutiveDelimitersAsOne=True SingleBlockTextImport=False
While I don't understand the meaning of each line, it really doesn't matter. The important part is the URL line. If we want to retrieve stock information for a specific list of stocks, we can programmatically create a text file, with the extension
.iqy, that contains each of these lines, but where the URL line is modified by replacing the prompt with a string of stock symbols, as in:
The last symbol ($INDU) specifies the Dow Jones Industrial Average. Other index symbols are contained in the LookupInfo worksheet in the accompanying Excel application.
|Figure 2: An example WebQuery|
For placing the Web into our Office documents, Microsoft provides the meager and very-poorly-documented WebBrowser control, which can be placed in an Office document to gain access to the Web.
Of course, once the Web browser is on, say, an Excel spreadsheet, we probably want to extract the data from a particular Web page and place it into the cells of the spreadsheet. For this, there is Dynamic HTML, which provides an object model that enables us to extract the data on the Web page.
|Figure 3: A portion of the Historical worksheet, with the WebBrowser control on the right.|
To add the WebBrowser control to an Excel spreadsheet, display the Control Toolbox toolbar and choose the More Controls button (the last button). Then search for Microsoft WebBrowser control. If you don't find it, then choose the last item, which is Register Custom Control and search for the file
SHDOCVW.DLL in your Windows System32 folder.
Once the control is placed on your spreadsheet, you may need to set its Locked property to False (the default seems to be True) so you can enter logon information for Yahoo (or whatever Web page you want to access).
Next, you must remember to set a reference to the Microsoft HTML Object Model (under the References item on the Tools menu in the VBA editor), in order to programmatically control the Web browser's contents.
Let me also mention that the WebBrowser control will automatically return to the previous URL every time the workbook is opened. If you want to prevent this (as I do), then you can add the following line to the Workbook's Open event:
Private Sub Workbook_Open() Worksheets("Historical").WebBrowser1.Navigate "about:blank" End Sub
I should also mention that the WebBrowser control seems to work fairly well in Excel 2000, but I have had a lot of trouble with it under Excel 97, with many General Protection Faults.
The WebBrowser control and the HTML object model are very poorly documented, even by Microsoft's standards. So here is what you need to know (you can get a list of the objects in the model by searching for the MSDN index entry DHTML Objects):
Private mDocument As MSHTML.HTMLDocument
Allproperty of the
HTMLDocumentobject returns the collection of all HTML elements. (The
Allproperty has about 70 entries in the MSDN library index, but none seem to relate to the
HTMLDocumentobject. Nevertheless, the property does work.)
Dim elem As Object
or the specific declaration
Dim elem As MSHTML.HTMLBaseElement
As usual, there are two potential disadvantages to using the generic declaration. One is performance, but this is not an issue for this type of application. The other is loss of Intellisense. Since the documentation is so poor, Intellisense is a big help, so I opt for the specific type declarations whenever possible. (This is how I found the
All property of the
tagNameproperty that helps identify the type of element. Since HTML tables have
tagNameTABLE, we can identify the tables in the document. A Table variable is declared as follows:
Dim tbl As MSHTML.HTMLTable
Tableobject, if you prefer) has a
Rowsproperty that returns the collection of
HTMLTableRowobjects, as in:
Set tr = tbl.Rows(0)
Rows collection has a
Length property that returns the number of rows in the table. (As you probably know, almost every Microsoft collection object uses a
Count property for this purpose, but why be consistent?)
HTMLTableRowobject has a
Cellsproperty that retrieves the collection of all cells in the row. Each cell can be accessed by index, as in:
Cells collection also has a
Length property that returns the number of cells in the row.
The WebBrowser control itself has a
DocumentComplete event that fires when the complete Web document has arrived, so we can set the
mDocument variable, in that event. This is also we here can dissect the Web page to get the data we want. The code for this event is shown below, but before we get to that, let us take a look at Yahoo's Web page.
c specify the month, day, and year of the starting date, and the parameters
f specify the month, day, and year of the ending date. The parameter
g specifies the frequency of stock quotes:
w (weekly), or
m (monthly). The parameter
s specifies the symbol. I don't know what the parameters
z specify, but it seems that we need to set
y to 0, and we can ignore
Of course, we need to change the URL and do a separate browse for each stock. Figure 3 shows only two stocks. (I use the blank cell in column A to stop the browsing.) In each case, we remove the data from the Web page and place it in the worksheet.
The page returned from Yahoo contains a second table below the input form, as shown in Figure 3. This table holds the historical stock data. The
MHTML object model can be used to grab that data, as shown in the
DocumentComplete event. I have used the fact that the table of historical data has the word "Date" in its first cell to identify it amongst the seven tables that are actually in this Web page (most of which don't look like tables).
Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, URL As Variant) ' Get the data from the WebBrowser control ' Columns of table Const conHistColDate = 0 Const conHistColHigh = 2 Const conHistColLow = 3 Const conHistColClose = 4 Const conHistColVol = 5 ' DHTML objects Dim elem As MSHTML.HTMLBaseElement Dim tbl As MSHTML.HTMLTable Dim tr As MSHTML.HTMLTableRow Dim iSourceRow As Integer Dim iTargetRow As Integer Dim iFreeCol As Integer Dim dtDate As Date Dim sDate As String ' Get out if not getting data If Not mbGetData Then Exit Sub ' Get reference to completed document Set mDocument = WebBrowser1.Document If mDocument Is Nothing Then Exit Sub ' Look for table of historical data ' This is the table that has "Date" in first cell Set tbl = Nothing For Each elem In mDocument.All ' Table element If elem.tagName = "TABLE" Then ' Get the table Set tbl = elem ' Does table have at least one row? If tbl.Rows.Length >= 1 Then Set tr = tbl.Rows(iSourceRow) If tr.Cells.Length >= 1 Then ' Does the first cell contain word "Date"? If tr.Cells(0).innerText = "Date" Then Exit For End If End If End If End If Next If tbl Is Nothing Then MsgBox "Could not find table of historical data", vbCritical Exit Sub End If ' Find the first available column For iFreeCol = 1 To 255 If Me.Cells(3, iFreeCol) = "" Then Exit For Next ' Place symbol in first col for this stock Me.Cells(2, iFreeCol).Value = msSymbol ' Column headings below symbol Me.Cells(3, iFreeCol).Value = "Date" Me.Columns(iFreeCol).NumberFormat = "@" Me.Cells(3, iFreeCol + 1).Value = "Close" ' Loop through rows of HTML table iTargetRow = 3 For iSourceRow = 1 To tbl.Rows.Length - 1 ' Get the next row of the HTML table Set tr = tbl.Rows(iSourceRow) ' Skip if not full complement of columns If tr.Cells.Length >= 7 Then iTargetRow = iTargetRow + 1 Me.Cells(iTargetRow, iFreeCol).Value = tr.Cells(conHistColDate).innerText Me.Cells(iTargetRow, iFreeCol + 1).Value = tr.Cells(conHistColClose).innerText End If Next End Sub
I have covered the main points of my Excel application, and the accompanying application will allow you to fill in the details. If you find any interesting extensions of this application, please let me know. You can leave a comment on my Web site at http://www.romanpress.com.
Steven Roman is a professor emeritus of mathematics at the California State University, Fullerton.
Return to the .NET DevCenter.
Copyright © 2009 O'Reilly Media, Inc.