AddThis Social Bookmark Button

Print

Pulling Stock Quotes into Microsoft Excel
Pages: 1, 2

Web Queries

A Web query retrieves text or data from tables or other formatted areas on Web pages. Finding decent help information about Web queries from the Excel 2000 help files is as frustrating as you might expect, but fortunately we don't need any information; we can just mimic a Web query that comes with Excel.



In fact, a handful of Web queries are installed along with Excel. For Excel 2000 (and Excel XP), one of these Web queries retrieves stock information from the MSN Web site. The actual text file is named Microsoft Investor Stock Quotes.iqy and is located in the Queries folder under the main Office folder. Its contents are:

WEB
1
http://investor.msn.com/external/excel/
quotes.asp?SYMBOL=["QUOTE0","Enter stock,
fund or other MS Investor symbols separated by commas."] Selection=EntirePage Formatting=All PreFormattedTextToColumns=True ConsecutiveDelimitersAsOne=True SingleBlockTextImport=False

While I don't understand the meaning of each line, it really doesn't matter. The important part is the URL line. If we want to retrieve stock information for a specific list of stocks, we can programmatically create a text file, with the extension .iqy, that contains each of these lines, but where the URL line is modified by replacing the prompt with a string of stock symbols, as in:

http://investor.msn.com/external/excel/quotes.asp?SYMBOL=IBM,DELL,$INDU

The last symbol ($INDU) specifies the Dow Jones Industrial Average. Other index symbols are contained in the LookupInfo worksheet in the accompanying Excel application.

Figure 2
Figure 2: An example WebQuery

The WebBrowser Control

For placing the Web into our Office documents, Microsoft provides the meager and very-poorly-documented WebBrowser control, which can be placed in an Office document to gain access to the Web.

Of course, once the Web browser is on, say, an Excel spreadsheet, we probably want to extract the data from a particular Web page and place it into the cells of the spreadsheet. For this, there is Dynamic HTML, which provides an object model that enables us to extract the data on the Web page.

Figure 3
Figure 3: A portion of the Historical worksheet, with the WebBrowser control on the right.

To add the WebBrowser control to an Excel spreadsheet, display the Control Toolbox toolbar and choose the More Controls button (the last button). Then search for Microsoft WebBrowser control. If you don't find it, then choose the last item, which is Register Custom Control and search for the file SHDOCVW.DLL in your Windows System32 folder.

Once the control is placed on your spreadsheet, you may need to set its Locked property to False (the default seems to be True) so you can enter logon information for Yahoo (or whatever Web page you want to access).

Next, you must remember to set a reference to the Microsoft HTML Object Model (under the References item on the Tools menu in the VBA editor), in order to programmatically control the Web browser's contents.

Let me also mention that the WebBrowser control will automatically return to the previous URL every time the workbook is opened. If you want to prevent this (as I do), then you can add the following line to the Workbook's Open event:

Private Sub Workbook_Open()
 Worksheets("Historical").WebBrowser1.Navigate "about:blank"
End Sub

I should also mention that the WebBrowser control seems to work fairly well in Excel 2000, but I have had a lot of trouble with it under Excel 97, with many General Protection Faults.

The WebBrowser control and the HTML object model are very poorly documented, even by Microsoft's standards. So here is what you need to know (you can get a list of the objects in the model by searching for the MSDN index entry DHTML Objects):

  • To refer to a document that has been browsed to by the WebBrowser control, use a variable declared as follows:
    Private mDocument As MSHTML.HTMLDocument
    
  • Each item on a Web page of relevance to us is an HTML Element. The All property of the HTMLDocument object returns the collection of all HTML elements. (The All property has about 70 entries in the MSDN library index, but none seem to relate to the HTMLDocument object. Nevertheless, the property does work.)
  • We can declare an element variable using the generic declaration
  • Dim elem As Object
    

    or the specific declaration

    Dim elem As MSHTML.HTMLBaseElement

    As usual, there are two potential disadvantages to using the generic declaration. One is performance, but this is not an issue for this type of application. The other is loss of Intellisense. Since the documentation is so poor, Intellisense is a big help, so I opt for the specific type declarations whenever possible. (This is how I found the All property of the Document element.)

  • Each element has a tagName property that helps identify the type of element. Since HTML tables have tagName TABLE, we can identify the tables in the document. A Table variable is declared as follows:
    Dim tbl As MSHTML.HTMLTable
    
  • The Table element (or Table object, if you prefer) has a Rows property that returns the collection of HTMLTableRow objects, as in:
    Set tr = tbl.Rows(0)
    

    The Rows collection has a Length property that returns the number of rows in the table. (As you probably know, almost every Microsoft collection object uses a Count property for this purpose, but why be consistent?)

  • The HTMLTableRow object has a Cells property that retrieves the collection of all cells in the row. Each cell can be accessed by index, as in:
    tr.Cells(0)
    

    The Cells collection also has a Length property that returns the number of cells in the row.

  • We can get the text in a cell by using the code
    tr.Cells(0).innerText
    

The WebBrowser control itself has a DocumentComplete event that fires when the complete Web document has arrived, so we can set the mDocument variable, in that event. This is also we here can dissect the Web page to get the data we want. The code for this event is shown below, but before we get to that, let us take a look at Yahoo's Web page.

If you browse to the URL http://chart.yahoo.com/t, you will get a small form like the one that appears at the top of the Web Control in Figure 3. This form can be used to specify the range of dates, the price frequency (daily, weekly, monthly) and the stock symbol. Here is an example of a URL sent to Yahoo:

http://chart.yahoo.com/t?a=01&b=01&c=01&d=10&e=26&f=01&g=d&s=ibm&y=0&z=ibm

The parameters a, b, and c specify the month, day, and year of the starting date, and the parameters d, e, and f specify the month, day, and year of the ending date. The parameter g specifies the frequency of stock quotes: d (daily), w (weekly), or m (monthly). The parameter s specifies the symbol. I don't know what the parameters y and z specify, but it seems that we need to set y to 0, and we can ignore z.

Of course, we need to change the URL and do a separate browse for each stock. Figure 3 shows only two stocks. (I use the blank cell in column A to stop the browsing.) In each case, we remove the data from the Web page and place it in the worksheet.

The page returned from Yahoo contains a second table below the input form, as shown in Figure 3. This table holds the historical stock data. The MHTML object model can be used to grab that data, as shown in the DocumentComplete event. I have used the fact that the table of historical data has the word "Date" in its first cell to identify it amongst the seven tables that are actually in this Web page (most of which don't look like tables).

Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, URL As Variant)

' Get the data from the WebBrowser control

' Columns of table
Const conHistColDate = 0
Const conHistColHigh = 2
Const conHistColLow = 3
Const conHistColClose = 4
Const conHistColVol = 5

' DHTML objects
Dim elem As MSHTML.HTMLBaseElement
Dim tbl As MSHTML.HTMLTable
Dim tr As MSHTML.HTMLTableRow

Dim iSourceRow As Integer
Dim iTargetRow As Integer
Dim iFreeCol As Integer
Dim dtDate As Date
Dim sDate As String

' Get out if not getting data
If Not mbGetData Then Exit Sub

' Get reference to completed document
Set mDocument = WebBrowser1.Document
If mDocument Is Nothing Then Exit Sub

' Look for table of historical data
' This is the table that has "Date" in first cell
Set tbl = Nothing
For Each elem In mDocument.All

   ' Table element
   If elem.tagName = "TABLE" Then
      ' Get the table
      Set tbl = elem
      ' Does table have at least one row?
      If tbl.Rows.Length >= 1 Then
         Set tr = tbl.Rows(iSourceRow)
         If tr.Cells.Length >= 1 Then
            ' Does the first cell contain word "Date"?
            If tr.Cells(0).innerText = "Date" Then
               Exit For
            End If
         End If
      End If
   End If

Next

If tbl Is Nothing Then
   MsgBox "Could not find table of historical data", vbCritical
   Exit Sub
End If

' Find the first available column
For iFreeCol = 1 To 255
   If Me.Cells(3, iFreeCol) = "" Then Exit For
Next


' Place symbol in first col for this stock
Me.Cells(2, iFreeCol).Value = msSymbol

' Column headings below symbol
Me.Cells(3, iFreeCol).Value = "Date"
Me.Columns(iFreeCol).NumberFormat = "@"

Me.Cells(3, iFreeCol + 1).Value = "Close"

' Loop through rows of HTML table
iTargetRow = 3
For iSourceRow = 1 To tbl.Rows.Length - 1

   ' Get the next row of the HTML table
   Set tr = tbl.Rows(iSourceRow)
   
   ' Skip if not full complement of columns
   If tr.Cells.Length >= 7 Then

      iTargetRow = iTargetRow + 1
     
      Me.Cells(iTargetRow, iFreeCol).Value = tr.Cells(conHistColDate).innerText
      Me.Cells(iTargetRow, iFreeCol + 1).Value = tr.Cells(conHistColClose).innerText
      
   End If

Next

End Sub

Related Reading

VB .NET Language in a NutshellVB .NET Language in a Nutshell
By Steven Roman, Ron Petrusha & Paul Lomax
Table of Contents
Index
Full Description
Read Online -- Safari

Conclusion

I have covered the main points of my Excel application, and the accompanying application will allow you to fill in the details. If you find any interesting extensions of this application, please let me know. You can leave a comment on my Web site at http://www.romanpress.com.

Steven Roman is a professor emeritus of mathematics at the California State University, Fullerton.


Return to the .NET DevCenter.