===========================================
Reusing Browser Technology - Driller (MFC)
===========================================
Last Updated: Sep. 19, 1999
SUMMARY
========
There are two Drill samples. Both do the same thing, but one is written
using MFC and the other is written using Visual Basic.
The Driller (MFC) sample is an MFC-based control host that hosts the
WebBrowser control as part of another application.
The buttons and address input are supplied by the hosting application and
commands are sent to the WebBrowser control on the form. Entering a URL in
the address field on the form will result in the WebBrowser control navigating
to that page.
Additionally, the Drill samples show how a hosting application can "drill"
into the WebBrowser control and investigate the loaded HTML document. In
this case the host uses Dynamic HTML to walk the all collection of the
document object of the loaded HTML page and populates a list box with each
element encountered.
DETAILS
========
Drilling into the hosted document and listing the tags it contains in the
list box is discussed in the readme file for Driller.
The Driller sample adds functionality by providing extra control
through the implementation of the IDocHostUIHandler interface. This sample
shows how this interface can be used to control the context menus and extend
the Dynamic HTML Object Model.
USAGE
======
The standard context menus that appear on a right mouse click are
disabled within the Driller sample. This is achieved by returning S_OK
from the ShowContextMenu method of the IDocHostUIHandler interface,
indicating that the host has handled the call and the IE components
need not perform the standard processing.
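The return-value convention described above can be sketched in portable
C++. This is only an illustration, not the sample's code: HResult, kSOk,
kSFalse, ContextMenuHandler and BrowserShowsDefaultMenu are hypothetical
stand-ins for HRESULT, S_OK, S_FALSE, the host's IDocHostUIHandler
implementation and the browser's own logic, and the parameter list is
simplified.

```cpp
#include <cassert>

// Stand-ins for the Windows SDK definitions, so that the sketch builds
// without the SDK (assumption: a real host uses HRESULT, S_OK, S_FALSE).
typedef long HResult;
const HResult kSOk = 0;     // same numeric value as S_OK
const HResult kSFalse = 1;  // same numeric value as S_FALSE

// Hypothetical host-side handler modelled on
// IDocHostUIHandler::ShowContextMenu, with a simplified parameter list.
struct ContextMenuHandler {
    bool suppressMenus;

    HResult ShowContextMenu(unsigned long /*menuId*/, int /*x*/, int /*y*/) const {
        // kSOk: the host claims to have handled the request, so the
        // browser shows nothing. kSFalse: the browser shows its own menu.
        return suppressMenus ? kSOk : kSFalse;
    }
};

// Models the browser side: the default menu appears only when the host
// did not report that it handled the call.
bool BrowserShowsDefaultMenu(const ContextMenuHandler& host) {
    return host.ShowContextMenu(0, 10, 20) != kSOk;
}
```

A handler constructed with suppressMenus set to true behaves like Driller
and never lets the default menu appear. Returning the S_FALSE equivalent
instead hands the request back to the browser.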
The Dynamic HTML Object Model is also extended in the Driller sample
by returning an IDispatch interface from the GetExternal method of the
IDocHostUIHandler interface. This IDispatch is used whenever script
within the HTML document refers to window.external; whatever follows it
is handed to the GetIDsOfNames member function of the IDispatch
interface to be resolved. This can be seen by loading the supplied
extend.htm file into Driller and pressing the Extend button.
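The name lookup behind window.external can be sketched in portable C++:
a member name is first resolved to a numeric ID (the role of
GetIDsOfNames) and the ID is then invoked (the role of Invoke). This is
only a model. ExternalObject, CallExternal and the member name
"hostMethod" are hypothetical, and the real IDispatch methods take more
parameters than are shown here.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef long DispId;               // stand-in for DISPID
const DispId kDispIdUnknown = -1;  // same numeric value as DISPID_UNKNOWN
const DispId kHostMethodId = 1;    // hypothetical ID chosen by the host

// Hypothetical model of the object a host returns from GetExternal.
class ExternalObject {
public:
    ExternalObject() : callCount_(0) {
        ids_["hostMethod"] = kHostMethodId;  // hypothetical member name
    }

    // Plays the role of GetIDsOfNames: map a member name to an ID.
    DispId GetIdOfName(const std::string& name) const {
        std::map<std::string, DispId>::const_iterator it = ids_.find(name);
        return it == ids_.end() ? kDispIdUnknown : it->second;
    }

    // Plays the role of Invoke: run the member identified by the ID.
    bool Invoke(DispId id) {
        if (id != kHostMethodId) return false;
        ++callCount_;
        return true;
    }

    int callCount() const { return callCount_; }

private:
    std::map<std::string, DispId> ids_;
    int callCount_;
};

// Models script evaluating window.external.<member>().
bool CallExternal(ExternalObject& ext, const std::string& member) {
    DispId id = ext.GetIdOfName(member);
    if (id == kDispIdUnknown) return false;  // the name cannot be resolved
    return ext.Invoke(id);
}
```

In this model a script reference to a name the host does not know fails
at the lookup step, before anything is invoked.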
IDocHostUIHandler must be implemented by the control's client site.
In MFC, the class COleControlSite encapsulates the client site. This
example subclasses MFC: a class CCustomControlSite is derived from
COleControlSite, and CCustomControlSite implements IDocHostUIHandler.
To hook in CCustomControlSite, a class CCustomOccManager is derived from
COccManager. Subclassing COccManager and COleControlSite in this manner
depends on MFC implementation details, so if future versions of MFC change
the implementation of COleControlSite or COccManager, this sample (and your
code, if you use this technique) might not work. We are looking at possible
ways to have MFC expose the client site for customization. If a future
version of MFC does expose the client site, we will modify this sample to
use that functionality. If you use this sample as a basis for your own code,
be advised that you may have to change that code in the future.
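The hook described above is a factory override: the framework asks a
manager object for a control site, and a derived manager returns a
derived site. A minimal portable sketch of that pattern follows.
ControlSite, CustomControlSite, OccManager and CustomOccManager are
hypothetical stand-ins for the MFC classes named above, and the
assumption that the override point is a virtual CreateSite function
should be checked against occimpl.h.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for COleControlSite.
class ControlSite {
public:
    virtual ~ControlSite() {}
    virtual std::string Describe() const { return "default site"; }
};

// Stand-in for CCustomControlSite, which in the sample also implements
// IDocHostUIHandler.
class CustomControlSite : public ControlSite {
public:
    std::string Describe() const override { return "custom site with UI handler"; }
};

// Stand-in for COccManager: the framework calls it to create sites.
class OccManager {
public:
    virtual ~OccManager() {}
    virtual std::unique_ptr<ControlSite> CreateSite() const {
        return std::unique_ptr<ControlSite>(new ControlSite());
    }
};

// Stand-in for CCustomOccManager: returns the custom site instead.
class CustomOccManager : public OccManager {
public:
    std::unique_ptr<ControlSite> CreateSite() const override {
        return std::unique_ptr<ControlSite>(new CustomControlSite());
    }
};

// Models the framework, which only ever sees the base manager type.
std::string SiteUsedByContainer(const OccManager& manager) {
    return manager.CreateSite()->Describe();
}
```

Because the framework holds only the base manager type, swapping in the
derived manager is enough to change which site every hosted control gets.
That is also why the technique is sensitive to changes inside MFC.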
To compile the 'Driller' sample, make sure you do not include the Headers and Libs
in your directory path.
The sample may fail to compile because an include directory has not been
identified. The program looks for the file called occimpl.h, and the
#include for that header is prefixed with the directory where the file was
located on the test machine. That location may differ on your machine. The
solution is to delete the prefixed directory and then, under
Project->Settings, go to the C/C++ tab, look under Preprocessor, and add the
include directory to the additional include directories. An example of what
that might look like is:
c:\program files\devstudio\vc\mfc.
OTHER
======
The sample is a Visual C++ 6.0 MFC-based ActiveX control container. It is
a dialog-based application with control container support enabled.
The two functions of interest are OnBtnGo() and
OnBtnDrill(). Both are in the file drillerDlg.cpp.
OnBtnGo()
The function uses the Navigate2 method of the IWebBrowser2
interface to navigate to the URL.
OnBtnDrill()
This function shows how to walk the object model for
an HTML page. It does the equivalent of the following
in script:
    cnt = document.all.length;
    for (i = 0; i < cnt; i++)
    {
        elem = document.all.item(i);
        output(elem.tagName);
        if (elem.tagName == "IMG")
            output(elem.href);
    }
The code in C++ will be as follows:
    IDispatch* pDisp = pBrowser->GetDocument();
The document property gives access to the object model
for the HTML document. From here on one can walk
down the dynamic HTML object model.
For example, getting the IHTMLDocument2 interface
is equivalent to getting to the document object
in the object model. The document object exposes
the HTML document through a number of collections
and properties.
All the interfaces for the object model are defined
in MSHTML.h which is in the IE5 Headers.
Some of the methods of the IHTMLDocument2 interface are:
    get_all( IHTMLElementCollection **p)
    get_images( IHTMLElementCollection **p)
    get_applets( IHTMLElementCollection **p)
    get_links( IHTMLElementCollection **p)
    get_forms( IHTMLElementCollection **p)
    get_anchors( IHTMLElementCollection **p)
    get_scripts( IHTMLElementCollection **p)
    get_frames( IHTMLFramesCollection2 **p)
    get_embeds( IHTMLElementCollection **p)
    get_plugins( IHTMLElementCollection **p)
Most of these methods return a collection that corresponds to the
collection in the object model. For example, get_all returns the all
collection, which is a collection of all the elements in the document.
Similarly, get_images returns the images collection, get_links the
links collection, get_forms the forms collection, and so on.
    IDispatch* pDisp = pBrowser->GetDocument();
    if (pDisp != NULL )
    {
        IHTMLDocument2* pHTMLDocument2;
        HRESULT hr;
        hr = pDisp->QueryInterface( IID_IHTMLDocument2,
                                    (void**)&pHTMLDocument2 );
        if (hr == S_OK)
        {
            IHTMLElementCollection* pColl;
            hr = pHTMLDocument2->get_all( &pColl );
Using get_all we get the all collection of the object model.
This is the same as doing allColl = document.all from within
script.
Below is how to walk the all collection. The same
technique can be used to walk any collection. The idea
is to get the length using get_length and then
iterate through each element.
            if (hr == S_OK)
            {
                LONG celem;
                hr = pColl->get_length( &celem );
                if (hr == S_OK)
                {
                    for ( LONG i = 0; i < celem; i++ )
                    {
                        VARIANT varIndex;
                        // VT_I4 matches the lVal member set on the next line.
                        varIndex.vt = VT_I4;
                        varIndex.lVal = i;
                        VARIANT var2;
                        VariantInit( &var2 );
                        IDispatch* pDisp;
                        hr = pColl->item( varIndex,
                                          var2, &pDisp );
The item method of the collection gives the element's dispatch
interface. Each element implements the IHTMLElement interface, which
can be used to access the element's methods and properties.
Again, look at the IHTMLElement interface in MSHTML.h for the
different methods that it has. There is a corresponding method
for every element property in the object model. So, for example,
if you want to get the tagName of the element, use the get_tagName
method. The tagName is the HTML tag of the element. For example,
the tagName of the following line of HTML code:
    <H1> This is Heading 1 </H1>
will be H1.
                        if (hr == S_OK)
                        {
                            IHTMLElement* pElem;
                            hr = pDisp->QueryInterface( IID_IHTMLElement, (void **)&pElem );
                            if (hr == S_OK)
                            {
                                BSTR bstr;
hr